ABSTRACT


Assume the error variance of the process, prior probabilities of the 
models being correct, and prior multivariate normal distributions on the 
parameters of the models are specified. 

A rule for termination of sampling is proposed. Upon termination, 
the model with the largest posterior probability is chosen as correct. 

If sampling is not terminated, posterior probabilities of the models and 
posterior distributions of the parameters are computed. The next experi- 
ment chosen is that which maximizes the expected Kullback-Leibler inform- 
ation function. Monte-Carlo simulation experiments were performed to in- 
vestigate large and small sample behavior of this sequential adaptive 
procedure. 
KULLBACK-LEIBLER INFORMATION FUNCTION AND THE SEQUENTIAL
SELECTION OF EXPERIMENTS TO DISCRIMINATE AMONG 
SEVERAL LINEAR MODELS 

by Steven M. Sidik 
Lewis Research Center 
SUMMARY 

Assume that a finite set of potential linear models relating several 
controlled variables to an observed variable is postulated and that ex- 
actly one of these models is the true model. The problem is to sequenti- 
ally design most informative experiments so that the correct model can be 
determined with as little experimentation as possible. We assume that 
the error variance of the process is known. In addition, we assume the 
statistician possesses prior information which can be expressed as the 
prior probability that each of the proposed models is indeed the correct 
model and prior multivariate normal distributions on the parameters of 
each of the postulated model equations. After each stage of sampling, 
the prior distributions and the observed data values are used to compute 
posterior probabilities of the models being the true one and posterior 
distributions on the parameters of the models. Then sampling is termin- 
ated if either a prespecified number of observations has been taken or 
if any of the posterior probabilities of the models achieves a prespeci- 
fied value. Upon termination of sampling, the model with the largest 
posterior probability is chosen to be the correct model. If sampling is 
not to be terminated, the next experiment chosen is that one in the set 
of allowable values of the controlled variables which maximizes the ex- 
pected Kullback-Leibler information function based upon the current pos- 
terior probabilities and distributions. 




An analytical study of this procedure is too complex to carry out
adequately. Hence, a number of Monte-Carlo simulation experiments were
performed to obtain information about the performance of this
adaptive design procedure. Two basic types of Monte-Carlo experiments
were performed. In the first, one of the models was chosen to be used to
generate the random observations using known fixed values for the
parameters. Then a large number of observations were taken using the
Kullback-Leibler information function as a criterion to choose the
sequence of experiments. It was found that the posterior probability of
the chosen model relatively rapidly approaches the value of 1.0 and then
fluctuates near 1.0. The posterior mean of the parameters of the correct
model also rapidly approaches the known fixed values used to generate the
observations.
In the second type of experiment, one of the models was chosen to be used 
to generate the random observations. Then for various combinations of the 
maximum number of observations, stopping criterion, prior distributions 
of the parameters, and error variance of the process, a large number of 
repetitions of the sequential design procedure were executed. Then the 
observed probability of correct selection and average sample number were 
calculated based upon the number of times the procedure chose the correct 
model and the number of observations taken until termination. 

INTRODUCTION 

The general linear model has become one of the most useful statis- 
tical tools available to the modern scientific experimenter. There have 
been many books and papers written about techniques for choosing the 
appropriate or "best" linear model to fit to a set of data already 
collected. In general, these have been methods of hypothesis testing to
determine which of a set of specified terms in a model equation may be 
dropped from the model. Much work has also been done with regard to the 
problem of designing best or optimal experiments to estimate the param- 
eters of specified model equations. 

In this report we study a sequential adaptive experimental design 
procedure for a related problem. (This paper is a summary of material 
from Sidik (ref. 1).) Assume that a finite set of potential linear 
models relating a finite set of controlled variables to an observed var- 
iable is postulated and that exactly one of these models is correct. 

The problem is to sequentially design most informative experiments so 
that the correct model equation can be determined with as little experi- 
mentation as possible. We also assume that the error variance of the 
process is known. In addition, we assume that the statistician possesses 
prior information which can be expressed by the prior probability that 
each of the proposed models is indeed the correct model and prior multi- 
variate normal distributions on the parameters of the various models. We 
then derive an adaptive procedure for designing the successive experiments 
using the Kullback-Leibler information function to maximize the antici- 
pated information for discriminating among the models. That is, after 
each stage of sampling, the prior distributions and the observed values 
are used to compute posterior probabilities of the postulated models being 
correct and posterior distributions on the parameters of the models. Then 
if sampling is not to be terminated, the next experiment chosen is that 
which maximizes the expected Kullback-Leibler information based on the 
current posterior probabilities and distributions. Sampling is terminated 
whenever either a prespecified number of observations is finally taken or
whenever any of the posterior probabilities of the models achieves a pre- 
specified value. Upon termination of sampling, the model with the largest 
posterior probability is chosen to be the correct model. 

An analytical study of this procedure is too complex to carry out
adequately. Hence, a number of Monte-Carlo simulation experiments were
performed to obtain information about the performance of this
adaptive design procedure. Two basic types of Monte-Carlo experiments
were performed. In the first, one of the models was chosen to be used to
generate the random observations using known fixed values for the
parameters. Then a large number of observations were taken using the
Kullback-Leibler information as a criterion to adaptively choose the
sequence of experiments. It was found that the posterior probability of
the chosen model relatively rapidly approaches the value of 1.0 and then
fluctuates near 1.0. The posterior mean of the parameters of the correct
model also rapidly approaches the known fixed values used to generate the
observations. In the second type of experiment, one of the models was
chosen to be used to generate the random observations using known fixed
values for the parameters. Then for various combinations of the maximum
number of observations, probability stopping criterion, assumed prior
distributions of the parameters, and error variance of the process, a
large number of repetitions of the sequential design procedure were
executed. Then a probability of correct selection and average sample
number were calculated based upon the number of times the procedure chose
the correct model and the number of observations taken until termination.

Lindley (ref. 2) was one of the first to consider the general idea
of applying information concepts to the problems of statistical infer- 
ence. He modified the concept of entropy and developed a number of in- 
teresting general results on the amount of information in an experiment 
about the parameters of the distribution of a random variable. 

Stone (ref. 3) was one of the first to consider information con- 
cepts as applied to designing and comparing regression experiments. He 
used a Bayesian framework, but the problem he considers is that of pa- 
rameter estimation rather than that of model selection. 

Another early and more relevant paper is that of Chernoff (ref. 4) 
who applied the Kullback-Leibler information function to the sequential
design of experiments when the cost of experimenting is small. His re- 
sults are valid for the case of two terminal decisions and a finite 
number of experiments and states of nature. These results have been gen- 
eralized by Albert (ref. 5) to an infinite number of states of nature and 
by Bessler (ref. 6) to an infinite number of experiments and k terminal 
actions. Kiefer and Sacks (ref. 7) have also provided some extensions. 

Hunter and Reiner (ref. 8) considered a sequential design procedure 
for discriminating between two model equations. Their procedure chooses 
the experimental conditions which, based upon maximum likelihood estim- 
ates of the parameters from the data already collected, separate the ex- 
pected values of the observed variable under the two models by as much 
as possible. 

Box and Hill (ref. 9) discussed the use of the Kullback-Leibler
information function, deriving it from considerations involving the entropy
function. They consider the use of the K-L information function to se-
quentially discriminate among several mechanistic (nonlinear) model equa- 
tions. Besides the fact that they consider nonlinear models, their ap- 
proach is different from that considered here in the sense that although 
they do assume prior probabilities on the proposed models, and compute 
posterior probabilities from the observations, they assume the parameters 
of the model equations are known constants. 

Meeter, Pirie, and Blot (ref. 10) have done a number of computer 
simulations comparing the methods of Chernoff and of Box and Hill. They 
found that the Box-Hill procedure performed quite well on the examples in 
comparison to Chernoff's procedure. It is interesting to note that
Chernoff seems to be the only one of these authors who defined an explicit
rule for terminating sampling. Although Chernoff's procedure is known
to be asymptotically optimal, it is also known to require very large 
sample sizes. 

STRUCTURE OF THE LINEAR MODELS 

In the theory of the general linear statistical model, we are concerned
with problems involving model equations relating K controlled
variables ($x_k$; $k = 1, \ldots, K$) to an observed variable ($y$). The form of
the model equation is required to be linear in the unknown parameters
$\beta_k$. If n observations are made upon y we let $x_{ki}$ denote the value
of $x_k$ at which the i-th observation is made. Thus, for the n
observations the model may conveniently be written as

$$\vec{y} = M\vec{\beta} + \vec{e} \tag{1}$$

where

$$\vec{y} = (y_1, y_2, \ldots, y_n)', \qquad \vec{e} = (e_1, e_2, \ldots, e_n)'$$

and where the variances of the $e_i$ are finite and the $e_i$ are uncorrelated. The
matrix M is called the design matrix for the experiment consisting of
the n observations. The problem of experimental design is that of choosing
the $x_{ki}$ values in some "optimal" manner.

In certain situations in practice the experimenter can postulate several
possible models involving different variables which correspond to several
possible mechanistic or empirically based theories. They may lead to
the various models containing different sets of $x_k$. There may be some
overlapping of the $x_k$ among the models or there may be none.

There are then two problems requiring solution. The first is that of 
choosing experiment designs which will enable the experimenter to decide 
which of the potential models is the correct one. Then, having chosen the 
model, the second problem is to estimate the parameters. The second 
problem has many solutions using a variety of standard techniques. This
report concerns itself with a method of designing experiments to provide 
information for choosing the appropriate model equation. 

We assume there are L different competing model equations. These
models may be combined into one large possible model equation and then the
L hypothetical models are equivalent to there being L hypotheses
restricting certain sets of parameters of the large model to be a priori
zero. For example, we might have two controlled variables $x_1$ and $x_2$,
and suppose the model equations postulated are:

$$\begin{aligned}
H_1 &: y = \beta_1^{(1)} x_1 + e \\
H_2 &: y = \beta_2^{(2)} x_2 + e \\
H_3 &: y = \beta_1^{(3)} x_1 + \beta_2^{(3)} x_2 + e
\end{aligned} \tag{2}$$

where $\beta_k^{(\ell)}$ denotes the coefficient of controlled variable k in model
equation $H_\ell$. A distinction must be made between the parameters in different
models because although, for example, $\beta_1^{(1)}$ and $\beta_1^{(3)}$ are coefficients
of the variable $x_1$, their distributions need not be the same. This notation
is clumsy, however, and if we implicitly accept the fact that the
distributions of the $\beta_k$ depend upon the model, we may more simply rewrite
the models as

$$\begin{aligned}
H_1 &: y = \beta_1 x_1 + e \\
H_2 &: y = \beta_2 x_2 + e \\
H_3 &: y = \beta_1 x_1 + \beta_2 x_2 + e
\end{aligned}$$
We say that models 1 and 2 are nested within model 3. (We will find in
the following work that the performance of the adaptive procedure and the
behavior of the posterior distributions are quite dependent on the
structure of the nesting of models.) This is equivalent to writing one model
as $y = \beta_1 x_1 + \beta_2 x_2 + e = \vec{x}\,'\vec{\beta} + e$ and hypothesizing

$$\begin{aligned}
H_1 &: \beta_2 = 0 \\
H_2 &: \beta_1 = 0 \\
H_3 &: \beta_1 \neq 0,\ \beta_2 \neq 0
\end{aligned}$$

In this sense it is seen that the words model and hypothesis are
interchangeable and will be used interchangeably in the remainder of this
report. The notation we adopt is that $H_\ell$ claims

$$\vec{y} = M_\ell \vec{\alpha}_\ell + \vec{e}$$

where $\vec{\alpha}_\ell$ is the appropriate $k_\ell \times 1$ vector of $\beta$'s from $\vec{\beta}$ which
appear in model $\ell$, and $M_\ell$ is the appropriate matrix of x's.

We now precisely state the three basic distributional assumptions 
about the parameters and random variables of the models : 

(1) The vector $\vec{e}$ follows a multivariate normal distribution with
mean $\vec{0}$ and precision matrix T. This is denoted by $\vec{e} \sim N(\vec{0}, T)$. The
precision matrix is the inverse of the covariance matrix of the
distribution and we assume it is known. Since T must be positive definite
symmetric, we need only consider the special case where $T = \tau I$
since linear transformation of the $\vec{y}$ reduces all other cases to this
one. Note that this implies $\tau$ is known.
(2) For each $\ell = 1, \ldots, L$ the prior distribution of $\vec{\alpha}_\ell$ is

$$\vec{\alpha}_\ell \sim N(\vec{\mu}_{\ell,0}, \Psi_{\ell,0})$$

where the mean vector $\vec{\mu}_{\ell,0}$ and the precision matrix $\Psi_{\ell,0}$ are known.

(3) The prior probability that the $\ell$-th model is the correct model
equation is assumed specified and denoted by $\theta_{\ell,0}$. We assume one and
only one of the models is correct and hence that $\sum_{\ell=1}^{L} \theta_{\ell,0} = 1.0$.

We now describe the space A of allowable experiments in more detail.
If the number of elements of $\vec{x}$ is K, then a choice of experiment $a \in A$
is composed of the number J of observations to take and J vectors
from some subset of Euclidean K-space. The J vectors specify the values
of the controlled variables $x_k$. At the j-th experiment or j-th stage
of experimenting the particular choice from A is denoted $a_j$.

PREREQUISITE DISTRIBUTION THEORY 

In the remainder of this report, much use will be made of the distri- 
bution of the observed variable, the posterior probabilities of the models 
and the posterior distributions of the parameters of the model equations. 
We present only the notations for these distributions and define the ap- 
propriate probability density functions. The distributions are developed 
in reference 1 and can also be derived from results in DeGroot (ref. 11) 
and Raiffa and Schlaifer (ref. 12). 

Let $f_\ell(\vec{y}_{j+1} | a_{j+1}, \vec{\alpha}_\ell)$ denote the density function of the vector
$\vec{y}_{j+1}$ under $H_\ell$ when the parameter values are given by $\vec{\alpha}_\ell$ at stage
j + 1 of sampling and the experiment is $a_{j+1}$. Let the probability
density function of $\vec{\alpha}_\ell$ after j stages of sampling be denoted $\xi_{\ell,j}(\vec{\alpha}_\ell)$.
This is a preposterior density since it serves as the posterior density
of $\vec{\alpha}_\ell$ after j stages of sampling and the prior density of $\vec{\alpha}_\ell$ before
the (j + 1)-st stage of sampling occurs.


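Lemma 1: After j stages of sampling, $\vec{\alpha}_\ell$ follows a multivariate normal
distribution with mean vector $\vec{\mu}_{\ell,j}$ and precision matrix $\Psi_{\ell,j}$. That is,
after j stages of sampling

$$\vec{\alpha}_\ell \sim N(\vec{\mu}_{\ell,j}, \Psi_{\ell,j})$$

where

$$\Psi_{\ell,j} = \Psi_{\ell,j-1} + M_{\ell,j}' T M_{\ell,j} = \Psi_{\ell,0} + \sum_{i=1}^{j} M_{\ell,i}' T M_{\ell,i} \tag{2}$$

and

$$\vec{\mu}_{\ell,j} = \Psi_{\ell,j}^{-1} \left( \sum_{i=1}^{j} M_{\ell,i}' T \vec{y}_i + \Psi_{\ell,0}\, \vec{\mu}_{\ell,0} \right) \tag{3}$$

and where $M_{\ell,j}$ denotes the design matrix specified by $a_j$ under $H_\ell$.

(For proof see ref. 1.)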
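As an illustration of Lemma 1, the following minimal sketch applies the update one observation at a time for a model with a single parameter, so that the design matrix, the precision matrices, and T all reduce to scalars. The function name and the numerical values are assumptions made for this illustration and are not taken from ref. 1. The recursive form used for the mean is algebraically the same as the summed form of equation (3).

```python
def update_parameter_posterior(mu_prev, psi_prev, x, y, tau):
    """One-stage update for a one-parameter model y = alpha * x + e.

    mu_prev, psi_prev: posterior mean and precision of alpha before this stage.
    x: value of the controlled variable, y: observed value.
    tau: known error precision (scalar form of T).
    """
    # Scalar form of equation (2): the precision accumulates x * tau * x.
    psi_new = psi_prev + x * tau * x
    # Recursive scalar form of equation (3).
    mu_new = (x * tau * y + psi_prev * mu_prev) / psi_new
    return mu_new, psi_new


# Hypothetical data: two observations taken at x = 1 and x = 2.
mu, psi = 0.0, 1.0
for x, y in [(1.0, 2.0), (2.0, 3.5)]:
    mu, psi = update_parameter_posterior(mu, psi, x, y, tau=4.0)
```

After both stages the precision equals the prior precision plus the sum of the terms x * tau * x, as the second form of equation (2) requires.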


We now turn to determining the distribution of $\vec{y}_{j+1}$. This is done
in two stages. First, we do not know which of the models is in fact the
correct one. Then for any given model, we do not know the value of $\vec{\alpha}_\ell$.
Let $f_\ell(\vec{y}_{j+1} | a_{j+1}, \vec{\alpha}_\ell)$ denote the distribution of $\vec{y}_{j+1}$ under $H_\ell$ when
experiment $a_{j+1} \in A$ is performed and $\vec{\alpha}_\ell$ is specified. Since we do not
know $\vec{\alpha}_\ell$ we must average this distribution over all $\vec{\alpha}_\ell$. Let
$f_\ell(\vec{y}_{j+1} | a_{j+1})$ denote the mixture of the densities $f_\ell(\vec{y}_{j+1} | a_{j+1}, \vec{\alpha}_\ell)$ with
respect to the marginal posterior of $\vec{\alpha}_\ell$.

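Lemma 2: The conditional distribution of $\vec{y}_j$ given $H_\ell$ and $a_j$ is a
multivariate normal distribution with mean vector $\vec{s}_{\ell,j}$ and precision
matrix $R_{\ell,j}$ where

$$R_{\ell,j} = T \left[ I - M_{\ell,j} \left( M_{\ell,j}' T M_{\ell,j} + \Psi_{\ell,j-1} \right)^{-1} M_{\ell,j}' T \right] \tag{4}$$

$$\vec{s}_{\ell,j} = R_{\ell,j}^{-1} T M_{\ell,j} \left( M_{\ell,j}' T M_{\ell,j} + \Psi_{\ell,j-1} \right)^{-1} \Psi_{\ell,j-1}\, \vec{\mu}_{\ell,j-1} \tag{5}$$

(For proof see ref. 1.)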
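A scalar sketch of Lemma 2 may help in reading equations (4) and (5). It assumes a one-parameter model and a single observation, so every matrix becomes a number; the function name and test values are hypothetical. In this case the precision from equation (4) is the reciprocal of the familiar predictive variance 1/tau + x*x/psi, and the mean from equation (5) simplifies to x times the current posterior mean.

```python
def predictive_mean_and_precision(mu_prev, psi_prev, x, tau):
    """Scalar forms of equations (4) and (5) for one observation.

    Returns (s, r): predictive mean and predictive precision of y
    under a one-parameter model y = alpha * x + e.
    """
    gain = 1.0 / (x * tau * x + psi_prev)          # (M'TM + Psi)^(-1)
    r = tau * (1.0 - x * gain * x * tau)           # equation (4)
    s = (1.0 / r) * tau * x * gain * psi_prev * mu_prev  # equation (5)
    return s, r


# Hypothetical state: posterior mean 1.6 and precision 5, next point x = 2.
s, r = predictive_mean_and_precision(1.6, 5.0, 2.0, 4.0)
variance_check = 1.0 / 4.0 + 2.0 * 2.0 / 5.0   # 1/tau + x^2/psi
```

The value `1.0 / r` agrees with `variance_check`, and `s` agrees with `2.0 * 1.6`, which is a quick consistency check on the reconstruction of the two equations.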

Since the true model is unknown we now compute the mixture of the
distributions of Lemma 2 with respect to the probabilities $\theta_{\ell,j-1}$ as

$$f(\vec{y}_j | a_j) = \sum_{\ell=1}^{L} \theta_{\ell,j-1}\, f_\ell(\vec{y}_j | a_j) \tag{6}$$

To compute the posterior probability of each model being correct
after the observation $\vec{y}_{j+1}$ is obtained, we apply Bayes theorem directly
to get

$$\theta_{\ell,j+1} = \frac{f_\ell(\vec{y}_{j+1} | a_{j+1})\, \theta_{\ell,j}}{\sum_{k=1}^{L} f_k(\vec{y}_{j+1} | a_{j+1})\, \theta_{k,j}} \tag{7}$$
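The update in equation (7) can be sketched in a few lines. The sketch assumes that the predictive density of the new observation under each model has already been evaluated (for example from Lemma 2); the function name and the numbers are illustrative only.

```python
def update_model_probabilities(theta, densities):
    """Apply equation (7).

    theta: current probabilities of the L models (summing to 1).
    densities: predictive density of the new observation under each model.
    """
    weighted = [t * f for t, f in zip(theta, densities)]
    mixture = sum(weighted)            # mixture density of equation (6)
    return [w / mixture for w in weighted]


# Hypothetical case: two equally likely models, the first fits better.
posterior = update_model_probabilities([0.5, 0.5], [0.3, 0.1])
```

With these inputs the first model's probability rises from 0.5 to 0.75, since its predictive density is three times that of the second model.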


ENTROPY FUNCTIONS AND THE KULLBACK-LEIBLER INFORMATION FUNCTION 


When comparing a number of experiments to determine which is the op- 
timal one to perform, one must define optimal. In this report, that ex- 
periment which yields the largest expected K-L information is defined 
as the optimal experiment. In particular, let I(w,a) denote the expected
K-L information as a function of the experiment a and the current
state w (i.e., the current values of the $\theta_\ell$, $\vec{\mu}_\ell$, and $\Psi_\ell$) of the
process. This function will be specified explicitly later. In this
section, we first describe how the K-L information arises from attempting
to reduce the entropy of the probabilities of the models. We then develop
an expression for I(w,a) and finally discuss the operational meaning of
the use of I(w,a) from a heuristic point of view.

Development of the K-L Information Function 
The problem under consideration here is that we must choose one of a
set of postulated model equations. For each model we have the posterior
probability $\theta_{\ell,j}$ that it is the correct one. We would like to choose
experiments which cause the posterior probability of the correct model
to increase most rapidly. An indirect method of accomplishing this is
to choose experiments which most rapidly decrease the entropy of the set
of probabilities $\theta_{i,j}$. The entropy is defined as

$$\phi(w) = -\sum_{i=1}^{L} \theta_{i,j} \ln \theta_{i,j}$$

It can be verified that the entropy attains a maximum when all the
probabilities are equal and attains a minimum when any one of the
probabilities is one and the rest are zero.
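The two extreme cases just mentioned are easy to check numerically. The short sketch below uses the usual convention that a term with zero probability contributes zero to the sum; the function name is an assumption for this illustration.

```python
import math


def entropy(theta):
    """Entropy of a set of model probabilities, with 0 * ln(0) taken as 0."""
    return -sum(t * math.log(t) for t in theta if t > 0.0)


uniform = entropy([1.0 / 3.0] * 3)     # all three models equally likely
certain = entropy([1.0, 0.0, 0.0])     # one model known to be correct
between = entropy([0.7, 0.2, 0.1])     # an intermediate state
```

Here `uniform` equals ln 3, `certain` is zero, and `between` lies strictly between the two.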

Box and Hill (ref. 9) proposed the use of the expected decrease between
the entropy at the current stage of sampling and the anticipated
entropy at the next stage of sampling as the criterion for selection of
experiments. They found, however, that the entropy function is quite
intractable analytically and applied a well-known inequality to show the
expected K-L information function provides an upper bound on the
reduction of entropy. Let $\theta_i(y|w,a)$ denote the posterior probability of
model i if the value y is observed when the state was w. Let w(y)
denote the state of the process after observing the value y when it was
in state w. Then the anticipated entropy is given by

$$\phi[w(y),a] = -\sum_{i=1}^{L} \theta_i(y|w,a) \ln \theta_i(y|w,a)$$

Thus, if the current state of the sampling process is w, and the
experiment $a \in A$ is performed, the expected decrease in entropy, R(w,a), is
defined as

$$\begin{aligned}
R(w,a) &= \phi(w) - E\{\phi[w(y),a]\} \\
&= -\sum_{i=1}^{L} \theta_i \ln \theta_i + \int \left[ \sum_{i=1}^{L} \theta_i(y|w,a) \ln \theta_i(y|w,a) \right] f(y|w,a)\, dy \\
&\leq \sum_{\ell=1}^{L} \sum_{i=1}^{L} \theta_\ell \theta_i \int f_\ell(y|w,a) \ln \frac{f_\ell(y|w,a)}{f_i(y|w,a)}\, dy
\end{aligned} \tag{8}$$

by application of the following inequality (Kullback (ref. 13), p. 15)

$$\sum_{i=1}^{L} \theta_i f_\ell(y|w,a) \ln \frac{f_\ell(y|w,a)}{f_i(y|w,a)} \geq f_\ell(y|w,a) \ln \frac{f_\ell(y|w,a)}{\sum_{k=1}^{L} \theta_k f_k(y|w,a)}$$

Let

$$I(w,a,i,j) = \int f_i(y|w,a) \ln \left[ \frac{f_i(y|w,a)}{f_j(y|w,a)} \right] dy \tag{9}$$

We note I(w,a,i,j) is defined as the expected amount of information
in the observations from experiment a for discriminating against $H_j$ in
favor of $H_i$. Let $\mathcal{J}(w,a)$ denote the matrix whose i,j element is
I(w,a,i,j). Then the inequality (8) may be written as

$$R(w,a) \leq \vec{\theta}\,' \mathcal{J}(w,a)\, \vec{\theta} = I(w,a) \tag{10}$$

Meeter et al. (ref. 10) proposed the following heuristic argument in favor
of using I(w,a). If one knew that $H_i$ were indeed the correct hypothesis
and wished to maximize the information about all $H_k$ for $k \neq i$,
then it would be natural to maximize

$$\sum_{k \neq i} \theta_k I(w,a,i,k)$$

But since $H_i$ is assumed correct only with probability $\theta_i$, it is equally
natural to multiply the foregoing expression by $\theta_i$ and sum over i to
obtain the anticipated information. But in doing this, one does end up
with I(w,a).

Evaluation of K-L Information Function 
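From Lemma 2 we have (if $\vec{y}$ is $J \times 1$) that the density of $\vec{y}$ under
$H_\ell$ is given by

$$f_\ell(\vec{y}|a) = (2\pi)^{-J/2} |R_\ell|^{1/2} \exp\{ -(\vec{y} - \vec{s}_\ell)' R_\ell (\vec{y} - \vec{s}_\ell)/2 \}$$

Hence

$$\frac{f_m(\vec{y}|a)}{f_n(\vec{y}|a)} = |R_m|^{1/2} |R_n|^{-1/2}\, \frac{\exp\{ -(\vec{y} - \vec{s}_m)' R_m (\vec{y} - \vec{s}_m)/2 \}}{\exp\{ -(\vec{y} - \vec{s}_n)' R_n (\vec{y} - \vec{s}_n)/2 \}}$$

Moreover

$$\ln \left[ \frac{f_m(\vec{y}|a)}{f_n(\vec{y}|a)} \right] = \frac{1}{2} (\ln|R_m| - \ln|R_n|) - \frac{1}{2} (\vec{y} - \vec{s}_m)' R_m (\vec{y} - \vec{s}_m) + \frac{1}{2} (\vec{y} - \vec{s}_n)' R_n (\vec{y} - \vec{s}_n) \tag{11}$$

and

$$I(w,a,m,n) = \int \ln \left[ \frac{f_m(\vec{y}|a)}{f_n(\vec{y}|a)} \right] f_m(\vec{y}|a)\, d\vec{y} = E_{N(\vec{s}_m, R_m)} \left\{ \ln \left[ \frac{f_m(\vec{y}|a)}{f_n(\vec{y}|a)} \right] \right\} \tag{12}$$

where the expectation is taken under the assumption $\vec{y} \sim N(\vec{s}_m, R_m)$. Note
that I(w,a,m,m) = 0.0 for m = 1, ..., L.

It can be shown (ref. 1) that

$$I(w,a,m,n) = \frac{1}{2} (\ln|R_m| - \ln|R_n|) - \frac{1}{2} J + \frac{1}{2} \mathrm{tr}(R_n R_m^{-1}) + \frac{1}{2} (\vec{s}_m - \vec{s}_n)' R_n (\vec{s}_m - \vec{s}_n) \tag{13}$$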
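For a single observation (J = 1) equation (13) involves only scalars, and it can be compared with a direct numerical evaluation of the integral in equation (12). The sketch below is such a check under that scalar assumption; the function names, the integration width, and the step count are choices made for this illustration.

```python
import math


def kl_information(s_m, r_m, s_n, r_n):
    """Equation (13) with J = 1: scalar means s and scalar precisions r."""
    return (0.5 * (math.log(r_m) - math.log(r_n)) - 0.5
            + 0.5 * r_n / r_m + 0.5 * r_n * (s_m - s_n) ** 2)


def log_normal_density(y, s, r):
    """Log density at y of a normal distribution with mean s, precision r."""
    return 0.5 * math.log(r / (2.0 * math.pi)) - 0.5 * r * (y - s) ** 2


def kl_by_quadrature(s_m, r_m, s_n, r_n, steps=20001, width=12.0):
    """Riemann-sum approximation of the integral in equation (12)."""
    sd = 1.0 / math.sqrt(r_m)
    lo = s_m - width * sd               # integrate over s_m +/- width * sd
    h = 2.0 * width * sd / (steps - 1)
    total = 0.0
    for i in range(steps):
        y = lo + i * h
        log_fm = log_normal_density(y, s_m, r_m)
        log_fn = log_normal_density(y, s_n, r_n)
        total += math.exp(log_fm) * (log_fm - log_fn) * h
    return total


closed_form = kl_information(0.0, 1.0, 1.0, 2.0)
numerical = kl_by_quadrature(0.0, 1.0, 1.0, 2.0)
```

The two values agree to well within the accuracy of the quadrature, and `kl_information` returns zero when both arguments describe the same distribution, in line with the note that I(w,a,m,m) = 0.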


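Adding the two directed informations for a pair of models gives

$$\begin{aligned}
I(w,a,m,n) + I(w,a,n,m) &= -J + \frac{1}{2} \left[ \mathrm{tr}(R_n R_m^{-1}) + \mathrm{tr}(R_m R_n^{-1}) \right] \\
&\quad + \frac{1}{2} \left[ (\vec{s}_m - \vec{s}_n)' R_n (\vec{s}_m - \vec{s}_n) + (\vec{s}_n - \vec{s}_m)' R_m (\vec{s}_n - \vec{s}_m) \right] \\
&= -J + \frac{1}{2} \left[ \mathrm{tr}(R_n R_m^{-1}) + \mathrm{tr}(R_m R_n^{-1}) \right] + \frac{1}{2} (\vec{s}_m - \vec{s}_n)' (R_m + R_n) (\vec{s}_m - \vec{s}_n)
\end{aligned} \tag{14}$$

Equation (14) is given in slightly different form in Kullback (1968,
p. 190). Thus,

$$\begin{aligned}
I(w,a) &= \sum_{n=2}^{L} \sum_{m=1}^{n-1} \theta_n \theta_m \left[ I(w,a,m,n) + I(w,a,n,m) \right] \\
&= \sum_{n=2}^{L} \sum_{m=1}^{n-1} \theta_n \theta_m \left\{ -J + \frac{1}{2} \left[ \mathrm{tr}(R_n R_m^{-1}) + \mathrm{tr}(R_m R_n^{-1}) \right] + \frac{1}{2} (\vec{s}_m - \vec{s}_n)' (R_m + R_n) (\vec{s}_m - \vec{s}_n) \right\} \\
&= -J \sum_{n=2}^{L} \sum_{m=1}^{n-1} \theta_m \theta_n + \frac{1}{2} \sum_{n=1}^{L} \theta_n\, \mathrm{tr} \left[ \left( \sum_{m \neq n} \theta_m R_m \right) R_n^{-1} \right] \\
&\quad + \frac{1}{2} \sum_{n=2}^{L} \sum_{m=1}^{n-1} \theta_n \theta_m (\vec{s}_m - \vec{s}_n)' (R_m + R_n) (\vec{s}_m - \vec{s}_n)
\end{aligned} \tag{15}$$

The last form of this equation appears to be the most convenient for
computing purposes.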
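The equivalence of the first and last forms of equation (15) can be checked numerically in the scalar case J = 1. The sketch below computes I(w,a) once as the weighted sum of all directed informations from equation (13) and once from the three-term computing form; the function names and the sample values of the probabilities, means, and precisions are assumptions for this illustration.

```python
import math


def pair_information(s_m, r_m, s_n, r_n):
    """Equation (13) with J = 1 (scalar means and precisions)."""
    return (0.5 * (math.log(r_m) - math.log(r_n)) - 0.5
            + 0.5 * r_n / r_m + 0.5 * r_n * (s_m - s_n) ** 2)


def information_pairwise(theta, s, r):
    """I(w,a) as the theta-weighted sum of all I(w,a,m,n) with m != n."""
    L = len(theta)
    return sum(theta[m] * theta[n] * pair_information(s[m], r[m], s[n], r[n])
               for m in range(L) for n in range(L) if m != n)


def information_three_terms(theta, s, r):
    """Last (computing) form of equation (15) with J = 1."""
    L = len(theta)
    pairs = [(m, n) for n in range(1, L) for m in range(n)]
    first = -1.0 * sum(theta[m] * theta[n] for m, n in pairs)
    second = 0.5 * sum(
        theta[n] * sum(theta[m] * r[m] for m in range(L) if m != n) / r[n]
        for n in range(L))
    third = 0.5 * sum(theta[m] * theta[n] * (s[m] - s[n]) ** 2 * (r[m] + r[n])
                      for m, n in pairs)
    return first + second + third


theta = [0.5, 0.3, 0.2]
means = [0.0, 1.0, 2.5]
precisions = [1.0, 2.0, 0.5]
a = information_pairwise(theta, means, precisions)
b = information_three_terms(theta, means, precisions)
```

The logarithmic terms cancel when the two directions are added, which is why they do not appear in the three-term form; `a` and `b` agree up to rounding.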
Intuitive Analysis

Looking at the computing form of equation (15) it can be seen that
there are three terms. The first term is $-J \sum_{n=2}^{L} \sum_{m=1}^{n-1} \theta_m \theta_n$. For a
fixed number of observations J, the value of this term does not depend
upon a and hence has no effect upon the choice of a. From this
consideration we note that computing the value of this term would not be
beneficial if only one more stage of experimentation is available.

The third term of the sum is a weighted sum of the quadratic forms

$$(\vec{s}_m - \vec{s}_n)' (R_m + R_n) (\vec{s}_m - \vec{s}_n)$$

Thus, this term is a separating function in the sense that these quadratic
forms will be maximized when the pairs of expected values of $\vec{y}$ under the
various hypotheses are as far apart as possible in comparison to the
precisions of $\vec{y}$. If the precisions $R_m$ and $R_n$ are large then $\vec{s}_m$ and
$\vec{s}_n$ do not need to be far apart to provide much information whereas if
these precisions are small then the expected values $\vec{s}_m$ and $\vec{s}_n$ must be
further apart to provide the same information. The weighting factors are
the products $\theta_n \theta_m$. Thus, when $\theta_n$ and $\theta_m$ are both small, $\theta_n \theta_m$ is very
small and the information due to the separation of $\vec{s}_n$ and $\vec{s}_m$ is
discounted somewhat. If $\theta_n$ and $\theta_m$ are large then the information due to
separation of $\vec{s}_n$ and $\vec{s}_m$ is given more importance. Thus, this third
term causes experiments to be chosen which separate the expected values of
$\vec{y}$ under the respective hypotheses which are still in serious contention
for being chosen.
It is interesting to note that some authors (Hunter and Reiner
(ref. 8), e.g.) have proposed criteria for selection of experiments in- 
volving only distances between expected values. In a later paper, Box 
and Hill (ref. 9) proposed that the distances as such are not important, 
but the distances weighted by some function of the variability about the 
expected values are important. It is seen here that the expected K-L 
information function does just that. 

The second term in equation (15) is

$$\frac{1}{2} \sum_{n=1}^{L} \theta_n\, \mathrm{tr} \left[ \left( \sum_{m \neq n} \theta_m R_m \right) R_n^{-1} \right]$$

This can be thought of as a weighted sum of ratios of precisions. If
only one y value is to be observed, this component becomes

$$\frac{1}{2} \sum_{n=1}^{L} \sum_{m \neq n} \theta_n \theta_m \frac{R_m}{R_n} \tag{16}$$

It would be interesting to see when this term is maximized. Upon taking
partial derivatives of equation (16), setting to zero, and simplifying,
one arrives at the following set of simultaneous nonlinear equations.

$$R_i^2 \sum_{k=1}^{L} \frac{\theta_k}{R_k} = \sum_{k=1}^{L} \theta_k R_k, \qquad i = 1, \ldots, L$$

It can be immediately seen that one solution to this system is
$R_1 = R_2 = \ldots = R_L$. This solution implies that the experiments should
tend to give the same precision for the expected value of y under each
hypothesis. This term is not considered any further here.

In summary, it can be seen that the expected K-L information func- 
tion in this case is basically a rather simple separating function. One 
would be hard pressed to construct a much simpler separating function
which has more intuitive appeal. 

THE SEQUENTIAL DECISION PROCEDURE 
Three components are required for a sequential adaptive decision 
procedure; (1) a rule which determines if sampling should be terminated 
or continued, (2) a rule which specifies the experiment to be performed 
given the current state of the system, and (3) a rule which selects the 
model equation which will be claimed to be true when sampling is termin- 
ated. 

Experiment Selection Rule 

The procedure adopted for this paper is the so-called myopic pro- 
cedure. This rule simply chooses as the next experiment that one which 
maximizes the anticipated K-L information for the next stage only. 

We assume that an upper limit, $J_{MAX}$, to the number of observations
is specified. This number may be infinite. An allocation of the
observations to the stages of sampling is described by a $J_{MAX} \times 1$ vector $\vec{n}$,
where $n_i$ gives the number of observations at stage i. The question
arises as to how the observations should be allocated. That is, should
all $J_{MAX}$ be taken at once, strictly one-at-a-time, or in different
sized groups? As the first step in answering this, let $A_j$ denote the
set of experiments in A which specify that exactly j observations
should be taken. For any given state w, let $a_j^*(w)$ denote the element
of $A_j$ such that

$$I[w, a_j^*(w)] = \sup_{a \in A_j} I(w,a)$$
Lemma 3: For any i and j such that $i \geq j$ we have

$$I[w, a_i^*(w)] \geq I[w, a_j^*(w)]$$

Proof: We introduce the following notation. Let $y_k(a_i^*)$, $k = 1, \ldots, i$
denote the random variables observed under $a_i^*(w)$ and $y_k(a_j^*)$,
$k = 1, \ldots, j$ denote the random variables observed under $a_j^*(w)$. Define
another experiment $\tilde{a}_i \in A_i$ by choosing the first j observations
according to $a_j^*$ and the remaining i - j observations according to the last
i - j of $a_i^*$. This leads to the random variables

$$y_k(\tilde{a}_i) = \begin{cases} y_k(a_j^*) & k = 1, \ldots, j \\ y_k(a_i^*) & k = j + 1, \ldots, i \end{cases}$$

Because I(w,a,m,n) is positive definite and is additive for independent
observations

$$I(w, \tilde{a}_i, m, n) \geq I(w, a_j^*, m, n)$$

Thus

$$I(w, \tilde{a}_i) = \vec{\theta}\,' [I(w, \tilde{a}_i, m, n)]\, \vec{\theta} \geq \vec{\theta}\,' [I(w, a_j^*, m, n)]\, \vec{\theta} = I(w, a_j^*)$$

But by definition $I(w, a_i^*) \geq I(w, \tilde{a}_i)$ and hence

$$I(w, a_i^*) \geq I(w, a_j^*)$$

Q.E.D.


The lemma simply proves that the optimal experiment with more obser- 
vations will be expected to provide at least as much information as the 
optimal one with fewer observations. In determining an allocation one 
should also consider the cost of experimenting. In particular, if we
assume that each observation has a constant cost associated with it, then
it is reasonable to choose the experiment which maximizes

$$\frac{1}{j} I(w, a_j), \qquad j = 1, \ldots, J_{MAX}$$

Thus, prior to stage k let $m = \sum_{i=1}^{k-1} n_i$ and assume $m < J_{MAX}$. The
optimal experiment is the element $a^* \in A$ which for the current state $w_{k-1}$
yields

$$\max_{j = 1, \ldots, J_{MAX} - m} \left\{ \sup_{a \in A_j} \frac{1}{j} I(w_{k-1}, a) \right\}$$

If sampling has not been terminated by the rules developed in the next
section then we stop when $\sum_i n_i = J_{MAX}$ and select the model according to
the rules in the next section.

Stopping and Model Selection Rules 
We now discuss the problems of determining which of the postulated 
models is the true one and determining when the results of the experi- 
ments are sufficiently informative to stop sampling and make the choice. 

Box and Hill (ref. 9) suggested that for their procedure, experi- 
menting be terminated whenever one model is clearly superior to the others. 
This is obviously a reasonable statement but it is in need of formal 
definition before it can be used as a stopping and selection rule. 

(1) Stopping rule: Let $\theta_{min}$ be some specified value
$1/L < \theta_{min} \leq 1.0$. This value is the probability stopping criterion. Let
$J_{MAX}$ denote the maximum number of observations permitted. Then terminate
sampling whenever either $\max_{i=1,\ldots,L} \{\theta_i\} \geq \theta_{min}$ or $J_{MAX}$ observations have
been taken, whichever occurs first.

(2) Model selection rule: Upon termination choose the correct model
to be $H_\ell$ where $\theta_\ell = \max_{i=1,\ldots,L} \{\theta_i\}$.
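The two rules above translate almost directly into code. The following sketch is an illustration only; the function names are hypothetical, and models are indexed from zero as is usual in Python.

```python
def should_stop(theta, observations_taken, theta_min, j_max):
    """Stopping rule: the largest model probability reaches theta_min,
    or the maximum number of observations j_max has been taken."""
    return max(theta) >= theta_min or observations_taken >= j_max


def select_model(theta):
    """Model selection rule: index of the model with the largest probability."""
    return max(range(len(theta)), key=lambda i: theta[i])


# Hypothetical state after 3 observations with three competing models.
theta = [0.10, 0.85, 0.05]
stop = should_stop(theta, observations_taken=3, theta_min=0.80, j_max=20)
chosen = select_model(theta)
```

In this example sampling stops because the second model has probability 0.85, which exceeds the criterion 0.80, and that model (index 1) is selected.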


SOME COMPUTER SIMULATION RESULTS 
General Simulation Procedure 

The sequential procedure proposed consisted of (1) an experiment 
termination rule, (2) an experiment selection rule, and (3) a model
selection rule. Because of the mathematical complexity of the distributions
involved (in particular, see eq. (6)) it was not feasible to analytically 
examine how well these rules work. The general procedure by which the 
Monte Carlo simulation technique was used to study performance is out- 
lined in the following algorithm. (The FORTRAN computer program is in- 
cluded in ref. 1.) 


1. Input:

   $\mu_{\ell,0}$      the prior means of the parameters of the models
   $T_{\ell,0}$        the prior precision matrices of the parameters of the
                       models
   $\theta_{\ell,0}$   the prior probabilities of the models being correct
   $N$                 the number of simulations
   $\theta_{min}$      probability stopping criterion
   $J_{MAX}$           maximum number of observations
   $\ell^*$            the model which generates the observed variable
                       (simulates choice by nature)
   $\mu^*$             values of the parameters of the true model (simulates
                       choice by nature)

2. $n \leftarrow 0$

3. PCS $\leftarrow 0$

4. $N_j \leftarrow 0$ (for $j = 1, \ldots, J_{MAX}$)

5. $j \leftarrow 0$

6. $j \leftarrow j + 1$

7. Determine optimal $a \in A$ as described in the section entitled
   "Experiment Selection Rule". Denote as $a^*$ and let $M_{\ell,a^*}$ denote
   the design matrix for model $\ell$ when $a^*$ is chosen. (All simula-
   tions in this report consider strictly one-at-a-time sampling for
   simplicity.)

8. $y_j \leftarrow M_{\ell^*,a^*}\, \mu^*$

9. Generate a pseudo-random observation $e_j$ from a $N(0,\tau)$ distri-
   bution. (Simulates action by nature)

10. $y_j \leftarrow y_j + e_j$

11. For $\ell = 1, \ldots, L$ compute $\theta_{\ell,j}$, $T_{\ell,j}$ and
    $\mu_{\ell,j}$ from $y_j$ and $\theta_{\ell,j-1}$, $T_{\ell,j-1}$, and
    $\mu_{\ell,j-1}$ as described in the section entitled
    "Prerequisite Distribution Theory".

12. Find $k$ such that $\theta_{k,j} = \max_i \{\theta_{i,j}\}$

13. If $j \ge J_{MAX}$ or $\theta_{k,j} \ge \theta_{min}$ go to 14.
    Otherwise go to 6.

14. $N_j \leftarrow N_j + 1$

15. If $k = \ell^*$, PCS $\leftarrow$ PCS + 1

16. $n \leftarrow n + 1$

17. If $n \ge N$ go to 18. Otherwise go to 5.

18. PCS $\leftarrow$ PCS$/N$

19. ASN $\leftarrow \sum_j j\, N_j / N$

20. Stop
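
The loop structure of the algorithm can be sketched as follows. This is a
deliberately simplified illustration, not the FORTRAN program of reference 1.
Two assumptions are made loudly: (a) each model is treated as fully known, so
step 11 updates only the model probabilities by Bayes' rule and no parameter
distributions; (b) the experiment selection rule is passed in as a function,
and the toy call below uses a fixed experiment as a placeholder for the
Kullback-Leibler rule. All names (`simulate`, `choose_experiment`,
`mean_response`) and the toy numbers are hypothetical. Models are indexed from
zero in the code.

```python
import math
import random


def simulate(n_sims, theta0, theta_min, j_max, true_model,
             choose_experiment, mean_response, tau, seed=0):
    """Monte Carlo estimate of (PCS, ASN) for a sequential procedure.

    mean_response(model, a) is the noiseless response of `model` under
    experiment `a`. tau is the precision (1/variance) of the error.
    """
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(tau)
    n_models = len(theta0)
    pcs = 0
    stops = [0] * (j_max + 1)              # stops[j] plays the role of N_j
    for _ in range(n_sims):                # steps 5 to 17
        theta = list(theta0)
        j = 0
        while True:
            j += 1                                         # step 6
            a = choose_experiment(theta, j)                # step 7
            y = mean_response(true_model, a) + rng.gauss(0.0, sigma)  # 8-10
            # Step 11 (simplified): Bayes update of the model probabilities.
            like = [math.exp(-0.5 * tau * (y - mean_response(m, a)) ** 2)
                    for m in range(n_models)]
            total = sum(t * l for t, l in zip(theta, like))
            theta = [t * l / total for t, l in zip(theta, like)]
            k = max(range(n_models), key=lambda i: theta[i])   # step 12
            if j >= j_max or theta[k] >= theta_min:            # step 13
                break
        stops[j] += 1                                          # step 14
        pcs += int(k == true_model)                            # step 15
    asn = sum(j * count for j, count in enumerate(stops)) / n_sims
    return pcs / n_sims, asn


# Toy setup (illustrative values, not taken from the report):
# model 0 says E[y] = +a, model 1 says E[y] = -a.
pcs, asn = simulate(
    n_sims=500, theta0=[0.5, 0.5], theta_min=0.90, j_max=8, true_model=0,
    choose_experiment=lambda theta, j: 1.0,    # placeholder selection rule
    mean_response=lambda m, a: a if m == 0 else -a,
    tau=1.0,
)
```

Passing the selection rule and the response function as arguments keeps the
bookkeeping for PCS and ASN separate from the distribution theory, which would
replace the simplified update in step 11 in a faithful implementation.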


Upon stopping, the value of PCS is the observed probability of cor-
rectly choosing $\ell^*$ as the true model for the prior distributions speci-
fied when in fact the true model is given by $H_{\ell^*}$ and the true value of
the parameters is given by $\mu^*$. ASN gives the average sample number upon
termination.

The above algorithm can be easily used for either large sample or small
sample studies. For example, for large sample studies set $\theta_{min} = 1.0$ and
$J_{MAX}$ to some large number, say 100 or 500. For small sample studies set
$\theta_{min} < 1.0$, $J_{MAX}$ to some small number, and $N$ to some larger number, say
500 or 1000. The following studies are some of the more interesting re-
sults from reference 1.

Large Sample Studies 

In this section we examine the large sample properties of the poster- 
ior probabilities of the models and the posterior means of the parameter 
distributions. Two sets of nested polynomial models are studied. The 
posterior probabilities of each model, the posterior means of the param- 
eter distributions, and the proportion of times each of the allowable 
values of the independent variable is chosen as optimal are tabulated for 
simulations of 100 and 500 observations. 

The two sets of nested polynomial models have the following general
form:

    $$H_\ell : \; y = \sum_{j=0}^{\ell-1} \beta_j x^j + e, \qquad \ell = 1, \ldots, L$$


Two values of $L$ are studied, and for each of these choices, two choices
of $\ell^*$ are made. The values of $\tau$, $\theta_{\ell,0}$ and $T_{\ell,0}$ are specified as

    $\tau = 100.0$
    $T_{\ell,0} = I$
    $\theta_{\ell,0} = 1/L$

for all simulations. The values of $\mu_{\ell,0}$ are tabulated at the tops of
figures 1 and 2 and the resulting functions are graphed on the interval
$x \in [-1,+1]$ at the bottoms of the respective figures. The value of $\tau$
represents a quantity known by the statistician and nature while $T_{\ell,0}$,
$\mu_{\ell,0}$ and $\theta_{\ell,0}$ represent the statistician's prior information. For
$L = 4$, the two choices of $H_{\ell^*}$ are $H_2$ and $H_3$. For $L = 6$, the two
choices of $H_{\ell^*}$ are $H_3$ and $H_5$. For simplicity, the actual values of
$\mu_{\ell^*,0}$ were chosen to be the values of the parameters used to generate
the data for each of the four cases. That is, $\mu_{\ell^*,0} = \mu^*$.

For these simulations, the definition of $A$ was arbitrarily taken
to be

    $$A = \{a^{(i)} : x = -1 + 2i/9, \; i = 0, 1, \ldots, 9\}$$



Note that sampling is strictly one observation per stage. 
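
For concreteness, the experiment space and the design row of each nested
polynomial model can be sketched as below. The spacing x = -1 + 2i/9 is
inferred from the correspondences quoted later in this section (for example
$p_0$ with x = -1, $p_4$ with x = -1/9, $p_9$ with x = +1), and the helper name
`design_row` is hypothetical.

```python
from fractions import Fraction


def design_row(x, ell):
    """Design row for model H_ell at the point x: [1, x, ..., x**(ell - 1)]."""
    return [x ** j for j in range(ell)]


# The ten allowable experiments a^(0), ..., a^(9), one observation each.
grid = [Fraction(-1) + Fraction(2 * i, 9) for i in range(10)]

row_h3 = design_row(grid[2], 3)   # quadratic model evaluated at x = -5/9
```

Exact fractions are used only to make the grid values easy to check by eye;
floating point values would serve equally well in a simulation.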

The simulation results are summarized in table 1 and given in further
detail in tables 2 through 9. For each choice of $L$ and $\ell^*$, five simu-
lations of 100 observations and five simulations of 500 observations were
performed. For the simulations of 500 observations, reporting the re-
sults for the first 100 observations thus gives results for a total of
10 simulations with 100 observations. For these simulations, the sample
paths of the $\theta_{\ell,j}$ were printed out and the choice of $a^{(i)}$ at each
stage was printed. The posterior means of the parameter distributions
were printed only after the last stage. Tables 2, 4, 6, and 8 give the
posterior probabilities after 100 observations and the first 100 out of
500 observations. The proportions $p_i$ of using $a^{(i)}$ are also given.
Tables 3, 5, 7, and 9 give the same information for the 500 observation
simulations.


Figures 3 and 4 present typical sample paths for the posterior prob-
ability of the correct model. In figure 3, the value of $\theta_{2,j}$ is plot-
ted for the first 250 observations of the third simulation for $L = 4$
and $\ell^* = 2$. In figure 4, the value of $\theta_{3,j}$ is plotted for the first
250 observations of the first simulation for $L = 4$ and $\ell^* = 3$. These
figures illustrate the typical behavior of $\theta_{\ell^*,j}$. It fairly rapidly
rises to a value of about 0.85 to 0.95 and then slowly and erratically
oscillates. As discussed in reference 1, this oscillation is suspected
to be because of the nested nature of the model equations.


For $L = 4$, consideration of tables 2 to 5 shows that the Euclidean
distance of $\mu_{\ell,j}$ from the vector $(\mu^{*\prime}, 0')'$ (the true parameter
values padded with zeros) decreases with $j$ for values
of $\ell$ greater than $\ell^*$. This is in accord with conclusions in chap-
ter 3 of reference 1. For $L = 6$ and $\ell^* = 3$ we again see the same
behavior as evidenced by tables 6 and 7. However, for $\ell^* = 5$, an en-
tirely different situation arises. To understand this we should note
that the model used to generate the sequential observations is

    $$y = 0.5x + 0.1x^4 + e$$

This function can be very closely approximated by a model of the form

    $$y = ax + bx^2 + e$$

over the range of $x$ values considered. And in fact we note from tables
8 and 9 that there is a marked preference for choosing the lower degree
model as indicated by $\theta_{3,j}$ becoming close to 1.0. It is also interest-
ing to note the behavior of $\mu_{\ell,j}$ for $\ell > 3$. We do not, in general,
see that $\mu_{\ell,j} \to (\mu_{3,j}', 0')'$ as might be expected when $H_3$ is so close to
being true, except for the case of $\ell = 4$. For $\mu_5$ we note that the
average posterior mean of the coefficient of $x^3$ is quite close to zero
and the sum of the posterior means of the coefficients of $x^2$ and $x^4$
is quite close to 0.1. For $\mu_6$ we note that the sums of the posterior
means of the coefficients of $x^2$ and $x^4$ are close to 0.1 and the sum
of the posterior means of the coefficients of $x$, $x^3$, and $x^5$ is close
to 0.5. From these simulation studies it is not clear whether this be-
havior is simply because 500 observations is not a sufficiently large num-
ber to discriminate well between such nearly equivalent functions or if
this behavior will persist no matter how large the number of observations.
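
The closeness of the two functions can be checked numerically. The sketch
below reads the generating model as y = 0.5x + 0.1x^4 + e (the exponent is
inferred from the $\ell^* = 5$ columns of table 1) and fits ax + bx^2 by
ordinary unweighted least squares on the ten equally spaced points of the
experiment space. The equal weighting and the omission of an intercept are
assumptions of this sketch; the sequential procedure itself does not sample
the points equally often. The function name `fit_ax_bx2` is hypothetical.

```python
def fit_ax_bx2(xs, f):
    """Least-squares fit of a*x + b*x**2 to f on the points xs,
    solved through the 2-by-2 normal equations."""
    s11 = sum(x ** 2 for x in xs)
    s12 = sum(x ** 3 for x in xs)
    s22 = sum(x ** 4 for x in xs)
    r1 = sum(x * f(x) for x in xs)
    r2 = sum(x ** 2 * f(x) for x in xs)
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def true_f(x):
    """Noiseless generating function, as read from the text and table 1."""
    return 0.5 * x + 0.1 * x ** 4


xs = [-1 + 2 * i / 9 for i in range(10)]
a, b = fit_ax_bx2(xs, true_f)
max_err = max(abs(true_f(x) - (a * x + b * x ** 2)) for x in xs)
```

Under these assumptions the largest discrepancy on the grid is a small
fraction of the error standard deviation $1/\sqrt{\tau} = 0.1$ used in the
large sample studies, which is in line with the observed difficulty of
separating the two models.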
We now turn to a discussion of the observed proportions of times the
$a^{(i)}$ were chosen as the optimal experiments. From tables 2 and 3 which
present the results of $L = 4$ and $\ell^* = 2$ we see that the largest $p_i$
are for $p_0$, $p_4$, $p_5$, and $p_9$. These correspond to $x = -1$, $x = -1/9$,
$x = +1/9$, and $x = +1$. Because of the discretization of the interval
$(-1,+1)$ we might assume that the asymptotically most informative experi-
ments were $x = -1$, $x = 0$, and $x = +1$. From tables 4 and 5 we see the
largest $p_i$ are $p_0$, $p_2$, $p_7$, and $p_9$ corresponding to $x = -1$, $x = -5/9$,
$x = +5/9$, and $x = +1$. The relationship of these proportions and $x$ points
to the experimental designs which are optimal from other considerations
might be interesting. For example, Kiefer and Wolfowitz (ref. 14) consider
optimal designs for regression problems of a somewhat different nature.
The comparison of the current results with such other works is currently
being pursued but will not be reported at this time.

Small Sample Performance Studies 

In this section we examine the performance of the proposed sequential 
procedure as measured by the PCS and ASN values. Two studies are pre- 
sented of the problem of discriminating among the three models 

    $H_1$: $y = \beta_1 x_1 + e$

    $H_2$: $y = \beta_2 x_2 + e$

    $H_3$: $y = \beta_1 x_1 + \beta_2 x_2 + e$

The first study assumes $H_1$ is true and the second study assumes $H_2$ is
true. The experiment space $A$ is defined as

    $A = \{(x_1, x_2): x_i = \pm 1$; one-at-a-time sampling$\}$
Study One - $H_1$ Assumed True

We study discriminating among

    $H_1$: $y = \beta_1 x_1 + e$

    $H_2$: $y = \beta_2 x_2 + e$

    $H_3$: $y = \beta_1 x_1 + \beta_2 x_2 + e$

where

    $A = \{(x_1, x_2): x_i = \pm 1$; one-at-a-time sampling$\}$

and

    $T_{\ell,0} = I$

    $\theta_{\ell,0} = 1/3$

y l,0 ^ 1, °^ p 2,0 



Then a number of simulation experiments were performed for each com-
bination of:

    $\tau = 0.50, 1.0, 2.0$
    $\theta_{min} = 0.70, 0.80, 0.90$
    $J_{MAX} = 8, 16$

and

    $\mu_{3,0} = (0.0, 0.0)', \; (0.50, 0.50)', \; (1.0, 1.0)', \; (1.5, 1.5)'$

The experiments for $J_{MAX} = 8$ used 1500 simulations and for $J_{MAX} = 16$
used 1000 simulations.
The choice of prior means deserves some comment. Figure 5 illus-
trates the points in $(\beta_1, \beta_2)$ coordinate space corresponding to the prior
means. The points corresponding to $\mu_{1,0}$ and $\mu_{2,0}$ are as close to $\mu^*$
as possible since $\mu_{1,0}$ is restricted to the horizontal axis and $\mu_{2,0}$
to the vertical. The four choices for $\mu_{3,0}$ then span a range about $\mu^*$
and hence the resulting PCS and ASN values will indicate the importance
of mis-specified prior means.

Tables 10 and 11 present the observed PCS and ASN values for the
combinations of $\theta_{min}$, $\tau$ and $\mu_{3,0}$.

In general, the results are about what should be expected. The PCS
increases with $\tau$ and ASN decreases with $\tau$. PCS increases as $\mu_{3,0}$
gets closer to $\mu^*$. We also note that in most cases, PCS increases with
$\theta_{min}$ for fixed $\tau$ and $\mu_{3,0}$. For some values of $\tau$, however, the value
of PCS increases and then decreases as $\theta_{min}$ increases. This is par-
ticularly apparent when $\mu_{3,0} = \mu^*$. There does not seem to be any ready
explanation for this.

Study Two - $H_2$ Assumed True

A much less extensive study of this case was made than the case of
$H_1$ assumed true. The same model equations were postulated and we assume






£ ,0 


1, • 


,L 


    $\mu_{1,0} = (0.0)$

    $\mu_{3,0} = (0.0, 1.0)'$

    $\mu^* = (1.0)$

The values of $\tau$, $\theta_{min}$, and $\mu_{2,0}$ which were simulated are tabulated in
table 12 along with the simulation results. Figure 6 illustrates the
prior means. Only one level of $J_{MAX}$ (= 8) was considered. Also, only
500 simulations were performed for each of these cases. The results are
generally the same as for $H_1$ true.

DISCUSSION OF RESULTS 

We now make some general observations concerning the results of the 
simulation experiments. 

First, consider the large sample results. In the context of the fact 
that sequential procedures are primarily developed in the hope that reli- 
able decisions can be made with small samples rather than large samples, 
these results are not of primary importance. It is interesting and in- 
formative to know, however, that the procedures are consistent. Since the 
study of limiting posterior distributions resulting from sequentially 
chosen experiments is known to be an extremely difficult and delicate prob- 
lem, simulation experiments may be helpful by indicating to researchers 
what large sample behavior is likely to be true. 

In the problems studied in reference 1 it seems quite likely that when 
non-nested models are encountered, the posterior probability of the true 
hypothesis has a limiting value of unity. For those non-nested models, 
the posterior mean of the parameters of the true hypothesis seemed to con- 
verge to the values of the unknown parameters generating the data. 

When nested models are encountered, however, the results are not as 
enlightening. It appears that if the posterior probability of the correct 
hypothesis does not achieve a limit of unity, it at least attains a large
value (in the range of 0.85 to 0.95) and then randomly fluctuates about
that value. There is indication that the conjecture of Box and Hill that
for certain nested models there is a distinct preference by the sequen-
tial procedure to choose the model with the smaller number of parameters
is true. For instance, the polynomial study $L = 6$, $\ell^* = 5$ indicates

that if a model with more parameters is true but can be approximated closely 
by one with fewer parameters, there is a preference for the smaller model. 

This point raises another question which is especially important as 
regards nested models. Although these simulation results appear to sup- 
port the observation of Box and Hill, that the posterior probability of 
the correct model rather rapidly gets close to unity (in the range of 0.85 
to 0.95), it is not clear that it ever does attain unity. In fact, it is 
quite possible that eventually $\theta_{\ell^*}$ fluctuates about some value which may
be a function of $L$, the numbers of parameters in the models, and the
space $A$. This would have quite a bit to do with the choice of $\theta_{min}$.
Too large a value would cause excessive (perhaps even infinite) sample
sizes.

In examining the small sample performance simulation experiments, it 
is seen that PCS drops off fairly rapidly as the distance of the prior 
mean of the correct model from the true values of the parameters increases. 
This supports the conjecture of Chernoff and Meeter et al. that there may 
often be "initial bungling". It should be noted, however, that in all 
cases studied, the prior means of the competing models were all set to be 
as close to the true model parameter values as could be done. Thus, in a 
sense, these experiments can be considered to be presenting the most 
unfavorable situation possible to the sequential procedure. In actual
application it might be more reasonable to assume that the prior distri-
butions of all the models are mis-specified to the same extent. This
problem of "initial bungling" should also indicate that the statistician 
should have the prior precision matrices of the parameter distributions 
be as vague as the prior information permits. 

One approach studied by Kiefer and Sacks (ref. 7) was to plan small 
initial experiments as a basis for gaining information to plan a large 
second experiment. An alternative not studied in this report, but which 
seems worthy of investigation, would be to set a lower limit
as the minimum number of observations taken before a stopping rule is ap-
plied. The sequential procedure would use the same rule as developed for
selection of experiments but large posterior probabilities on the models 
would be ignored until a sufficient number of observations are taken to 
avoid the consequences of initial bungling. This also makes sense from 
the point of view of obtaining parameter estimates. Surely an experi- 
menter would not be content to terminate sampling with two or three ob- 
servations even if the resulting probabilities are overwhelmingly in favor 
of one hypothesis unless he had extremely good prior information. 
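
The suggested modification amounts to a small change in the stopping test.
The sketch below is one possible reading of it, not a rule studied in this
report; the name `should_stop_with_minimum` and the parameter `j_min` (the
lower limit on the number of observations) are hypothetical.

```python
def should_stop_with_minimum(theta, theta_min, n_obs, j_min, j_max):
    """Stopping test that ignores large posterior probabilities until
    at least j_min observations have been taken."""
    if n_obs >= j_max:
        return True           # the observation budget is exhausted
    if n_obs < j_min:
        return False          # too early: guard against initial bungling
    return max(theta) >= theta_min


# Illustrative values: an overwhelming posterior after only two observations.
early = should_stop_with_minimum([0.97, 0.02, 0.01], 0.90, n_obs=2,
                                 j_min=5, j_max=16)
later = should_stop_with_minimum([0.97, 0.02, 0.01], 0.90, n_obs=5,
                                 j_min=5, j_max=16)
```

Experiment selection would proceed exactly as before during the first
`j_min` observations; only the decision to stop is deferred.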
APPENDIX
LIST OF SYMBOLS

$A$            the space of allowable experiments
$A_j$          the space of allowable experiments requiring exactly $j$
               observations
ASN            average sample number
$a$            element of $A$
$a^{(i)}$      the $i$th experiment in $A$
$a_j$          experiment in $A$ performed at the $j$th stage of sampling
$a^*$          optimal experiment in $A$
$B$            vector of parameters appearing in combined model equations
$E\{X\}$       expectation of the random variable $X$
$E\{X|Y\}$     conditional expectation of the random variable $X$ given
               the value of $Y$


Aw) entropy of the probabilities at state w 

^[w(y),a] entropy of the posterior probabilities if system is in 


f £ (y|a,a) 

f £ (y|a) 

f 2 (?. +1 |a,a) 


state w and the value y is observed 
density function of y under model l when a is given 
and experiment a is to be performed 
marginal density function of y under model l when ex- 


periment a is to be performed 
density function of Xj+]_ under model & when a and a 
are given 


marginal density function of y_._^ under model i when 


a is given 
$H_\ell$              denotes hypothesis $\ell$ about the form of the model equation
$I$                   identity matrix
$I(w,a)$              expected information in experiment $a$ when state of system
                      is $w$
$I(w,a,i,j)$          expected information for discriminating in favor of $H_i$
                      against $H_j$ in experiment $a$ when state of system is $w$
$\mathcal{I}(w,a)$    matrix of $I(w,a,i,j)$
$J_{MAX}$             upper limit on total number of observations
$K$                   number of controlled variables
$\ell$                subscript denoting model equation ($\ell = 1, \ldots, L$)
$L$                   number of model equations or hypotheses postulated
$\ell^*$              true model equation subscript
$M$                   design matrix
$M_\ell$              design matrix for model $\ell$
$M_{\ell,j}$          design matrix for model $\ell$ under experiment $a_j$
$N(\mu,T)$            normal distribution with mean vector $\mu$ and precision
                      matrix $T$
$N$                   number of simulations for Monte-Carlo study
$n$                   counter on simulations performed in algorithm
$N_j$                 counter of number of times sampling terminates with $j$ ob-
                      servations in algorithm
$\mathbf{n}$          vector of $n_i$
$n_i$                 number of observations taken at stage $i$
PCS                   probability of correct selection
$p_i$                 proportion of times $a^{(i)}$ performed in a sequence of
                      experiments
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A,j 

R(w,a) 


A.j 


w 


precision matrix of distribution of y under model I 
expected reduction in entropy if experiment a is per- 
formed and state is w 

mean vector of distribution of y. under model 2. 

3 

precision matrix of distribution of e 
state of sampling system defined by values of 


X 


x. , 
i,k 

y. y 

*>■ 

a. 


vector of x, 


B 

3 

K 

z 


k 

a) 


A.J 


min 

* 


y 


*»j 




th 


value of at i observation 

observed variable 

vector of parameters in model equation £ 
coefficient of x^ 

coefficient of x, in model £ 
k 

vector of observation errors 
probability model £ is correct 

posterior probability that model £ is correct after 
stages of sampling 
probability stopping criterion 

mean vector of distribution of parameters in model £ 
values of parameters of the true model 

mean vector of distribution of parameters in model £ 
j stages of sampling 

density function of parameters in model £ after j 


after 


stages 


of sampling 
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precision of distribution of e 

precision matrix of distribution of parameters of model % 
after j stages of sampling 
vector of zeros 
determinant of a matrix 


distributed as 
APPENDIX C 
TABLES 


TABLE 1. - SUMMARY OF SIMULATION RESULTS PRESENTED IN TABLES 2 THROUGH 9

                      L = 4, $\ell^*$ = 2   L = 4, $\ell^*$ = 3   L = 6, $\ell^*$ = 3   L = 6, $\ell^*$ = 5
Model  Parameter       100 obs    500 obs    100 obs    500 obs    100 obs    500 obs    100 obs    500 obs

  1    $\theta_1$            0          0          0          0          0          0          0          0
  2    $\theta_2$         .966       .983          0          0          0          0       .010          0
  3    $\theta_3$         .032       .016       .922       .962       .860       .877       .902       .941
  4    $\theta_4$         .002       .001       .078       .038       .109       .107       .062       .023
  5    $\theta_5$                                                       .022       .012       .017       .029
  6    $\theta_6$                                                       .009       .004       .009       .006

  1    $\beta_0$        0.0971     0.0891     0.1322     0.1377     0.1116     0.1357     0.0278    -0.0240

  2    $\beta_0$        0.1010     0.0951     0.1313     0.1384     0.1253     0.1286     0.0316     0.0382
       $\beta_1$         .4981      .5025      .2485      .2522      .2411      .2544      .5114      .4977

  3    $\beta_0$        0.1013     0.0941    -0.0070    -0.0067    -0.0038    -0.0015    -0.0249    -0.0099
       $\beta_1$         .4981      .5025      .2478      .2525      .2493      .2505      .5086      .5032
       $\beta_2$        -.0007      .0021      .2578      .2634      .2157      .2544      .1179      .0981

  4    $\beta_0$        0.1010     0.0943    -0.0070    -0.0335    -0.0035    -0.0015    -0.0237    -0.0097
       $\beta_1$         .5029      .4915      .2497      .2416      .2572      .2429      .5168      .5046
       $\beta_2$        -.0004      .0018      .2579      .2634      .2645      .2544      .1168      .0976
       $\beta_3$        -.0076      .0112     -.0026      .0146     -.0089      .0102     -.0083     -.0017

  5    $\beta_0$                                                      0.0040     0.0015    -0.0157     0.0001
       $\beta_1$                                                       .2564      .2440      .5120      .5009
       $\beta_2$                                                       .2222      .2577     -.0020      .0227
       $\beta_3$                                                       .0032      .0088     -.0048      .0024
       $\beta_4$                                                       .0360     -.0015      .0908      .0730

  6    $\beta_0$                                                      0.0030    -0.0037    -0.0180    -0.0021
       $\beta_1$                                                       .2390      .2559      .5057      .4638
       $\beta_2$                                                       .2291      .2619      .0603      .0376
       $\beta_3$                                                       .0596     -.0373      .0438      .1340
       $\beta_4$                                                       .0301     -.0050      .0553      .0598
       $\beta_5$                                                      -.0534      .0345     -.0413     -.0969

The column headings give the values of $L$ and $\ell^*$ and the number of observations. The row headings
present the parameters whose average posterior values are given. The probabilities listed for 100 ob-
servations are the averages after five simulations of 100 observations and the values after the first
100 observations of the 500 observation simulations. The averages of the posterior parameter means
are based only upon the five full simulations of 100 and 500 observations, respectively. The posterior
probabilities for 500 observations are based upon five simulations of 500 observations each.











































































TABLE 2. - L = 4, $\ell^*$ = 2

Model  Parameter        After 100 observations           After first 100 of 500 observations

  1    $\theta_1$         0       0       0       0       0       0       0       0       0       0
  2    $\theta_2$      .973    .979    .974    .976    .975    .976    .976    .931    .977    .923
  3    $\theta_3$      .025    .019    .024    .023    .024    .023    .023    .063    .022    .071
  4    $\theta_4$      .002    .001    .002    .001    .002    .001    .001    .006    .001    .006

  1    $\beta_0$     0.0795  0.1017  0.0906  0.1271  0.0865       *       *       *       *       *

  2    $\beta_0$     0.1187  0.1017  0.0753  0.1032  0.1059       *       *       *       *       *
       $\beta_1$      .5192   .5022   .4935   .4892   .4865

  3    $\beta_0$     0.1263  0.1021  0.0682  0.1067  0.1033       *       *       *       *       *
       $\beta_1$      .5191   .5021   .4935   .4894   .4866
       $\beta_2$     -.0152  -.0008   .0141  -.0069   .0055

  4    $\beta_0$     0.1239  0.1022  0.0688  0.1059  0.1041       *       *       *       *       *
       $\beta_1$      .4858   .5044   .4771   .5186   .5288
       $\beta_2$     -.0126  -.0010   .0133  -.0051   .0032
       $\beta_3$      .0351  -.0024   .0170  -.0350  -.0527

       $p_0$           0.25    0.23    0.23    0.17    0.19    0.18    0.23    0.27    0.21    0.25
       $p_1$              0       0       0       0       0       0       0       0       0       0
       $p_2$            .05     .05     .03     .17     .20     .24     .06     .04     .02     .01
       $p_3$            .02     .03     .02     .07       0     .02     .01     .01     .03       0
       $p_4$            .35     .20     .14     .01     .17     .04     .26     .43     .11     .25
       $p_5$            .06     .20     .27     .14     .06     .12     .14     .01     .19     .24
       $p_6$            .01       0     .06     .09     .02     .03     .01     .01     .17       0
       $p_7$            .05     .05       0     .13     .21     .19     .06     .02     .01     .01
       $p_8$              0       0       0       0       0       0       0       0       0       0
       $p_9$            .21     .24     .25     .22     .15     .19     .23     .21     .26     .24

* Not recorded.

The values of the posterior probabilities and parameter means after ten simulations, of
100 observations each, of the sequential selection procedure. The last five columns are
data from the first 100 observations of the 500 observation simulations tabulated in
table 3. The posterior means were not recorded for these cases. Also listed are the
proportions $p_i$ of the times each $a^{(i)}$ was chosen as the optimal experiment.
TABLE 3. - L = 4, $\ell^*$ = 2

Model  Parameter                 After 500 observations

  1    $\theta_1$           0         0         0         0         0
  2    $\theta_2$        .991      .985      .990      .991      .957
  3    $\theta_3$        .009      .015      .009      .009      .040
  4    $\theta_4$           0         0         0         0      .003

  1    $\beta_0$       0.0875    0.0599    0.0905    0.0757    0.1317

  2    $\beta_0$       0.0921    0.0984    0.0999    0.0942    0.0909
       $\beta_1$        .4964     .5010     .5028     .5014     .5108

  3    $\beta_0$       0.0923    0.1032    0.0984    0.0937    0.0827
       $\beta_1$        .4964     .5010     .5028     .5014     .5108
       $\beta_2$       -.0005    -.0096     .0029     .0012     .0163

  4    $\beta_0$       0.0923    0.1022    0.0985    0.0935    0.0805
       $\beta_1$        .4948     .4886     .5043     .4882     .4817
       $\beta_2$       -.0004    -.0086     .0028     .0014     .0139
       $\beta_3$        .0016     .0126    -.0017     .0141     .0293

       $p_0$            0.234     0.264     0.236     0.236     0.228
       $p_1$                0         0         0         0      .002
       $p_2$             .050      .010      .056      .044         0
       $p_3$             .008         0      .004      .008         0
       $p_4$             .280      .422      .302      .306      .076
       $p_5$             .110      .060      .070      .088      .418
       $p_6$             .018         0      .026      .076      .006
       $p_7$             .072      .010      .088      .038      .002
       $p_8$                0         0         0         0         0
       $p_9$             .228      .226      .218      .204      .268

The values of the posterior probabilities and parameter
means after 5 simulations, of 500 observations each, of
the sequential selection procedure. Also listed are
the proportions $p_i$ of the times each $a^{(i)}$ was chosen
as the optimal experiment.
TABLE 4. - L = 4, $\ell^*$ = 3

Model  Parameter        After 100 observations           After first 100 of 500 observations

  1    $\theta_1$         0       0       0       0       0       0       0       0       0       0
  2    $\theta_2$         0       0       0       0       0       0       0       0       0       0
  3    $\theta_3$      .788    .828    .967    .966    .966    .962    .916    .966    .941    .920
  4    $\theta_4$      .212    .172    .033    .034    .034    .038    .084    .034    .059    .080

  1    $\beta_0$     0.1140  0.1548  0.1260  0.1292  0.1368       *       *       *       *       *

  2    $\beta_0$     0.1183  0.1515  0.1255  0.1286  0.1324       *       *       *       *       *
       $\beta_1$      .2452   .2533   .2385   .2575   .2482

  3    $\beta_0$    -0.0043 -0.0089 -0.0159  0.0104 -0.0162       *       *       *       *       *
       $\beta_1$      .2467   .2510   .2389   .2552   .2470
       $\beta_2$      .2263   .2939   .2683   .2216   .2788

  4    $\beta_0$    -0.0045 -0.0091 -0.0158  0.0105 -0.0162       *       *       *       *       *
       $\beta_1$      .1838   .3095   .2395   .2633   .2523
       $\beta_2$      .2268   .2945   .2681   .2215   .2788
       $\beta_3$      .0833  -.0779  -.0009  -.0107  -.0070

       $p_0$           0.18    0.17    0.17    0.17    0.17    0.18    0.17    0.17            0.17
       $p_1$              0       0       0       0       0       0       0       0               0
       $p_2$            .32     .31     .30     .30     .29     .32     .30     .30             .30
       $p_3$              0     .01       0     .02       0       0       0       0             .02
       $p_4$              0       0     .03       0     .02       0     .03     .02       0       0
       $p_5$            .03     .01     .02     .03     .04     .02     .02     .02     .01     .05
       $p_6$              0       0     .01     .02       0     .01       0       0       0     .02
       $p_7$            .30     .32     .30     .28     .30     .30     .31     .31     .32     .28
       $p_8$              0       0       0       0       0       0       0       0       0       0
       $p_9$            .17     .18     .17     .18     .18     .17     .17     .18     .18     .16

The values of the posterior probabilities and parameter means after 10 simulations, of 100 ob-
servations each, of the sequential selection procedure. The last 5 columns are data from the
first 100 observations of the 500 observation simulations tabulated in table 5. The posterior
means for these 5 cases were not tabulated. Also listed are the proportions $p_i$ of the times
each $a^{(i)}$ was chosen as the optimal experiment.
TABLE 5. - L = 4, $\ell^*$ = 3

Model  Parameter                 After 500 observations

  1    $\theta_1$           0         0         0         0         0
  2    $\theta_2$           0         0         0         0         0
  3    $\theta_3$        .953      .982      .953      .969      .954
  4    $\theta_4$        .047      .018      .047      .031      .046

  1    $\beta_0$       0.1368    0.1385    0.1383    0.1394    0.1354

  2    $\beta_0$       0.1376    0.1398    0.1391    0.1387    0.1369
       $\beta_1$        .2561     .2463     .2608     .2550     .2426

  3    $\beta_0$      -0.0227    0.0085   -0.0069   -0.0028   -0.0096
       $\beta_1$        .2566     .2469     .2611     .2546     .2434
       $\beta_2$        .2906     .2392     .2648     .2548     .2677

  4    $\beta_0$      -0.0227    0.0085   -0.0069   -0.0028   -0.0096
       $\beta_1$        .2355     .2385     .2399     .2715     .2225
       $\beta_2$        .2905     .2392     .2649     .2548     .2678
       $\beta_3$        .0281     .0112     .0281    -.0223     .0277

       $p_0$            0.178     0.178     0.178     0.178     0.178
       $p_1$                0         0         0         0         0
       $p_2$             .322      .320      .320      .318      .318
       $p_3$                0         0         0      .002      .004
       $p_4$                0      .006      .004         0         0
       $p_5$             .004      .004      .004      .002      .010
       $p_6$             .002         0         0         0      .004
       $p_7$             .318      .318      .318      .320      .312
       $p_8$                0         0         0         0         0
       $p_9$             .176      .174      .176      .180      .174

The values of the posterior probabilities and parameter means
after 5 simulations, of 500 observations each, of the sequen-
tial selection procedure. Also listed are the proportions
$p_i$ of the times each $a^{(i)}$ was chosen as the optimal
experiment.
TABLE 6. - L = 6, $\ell^*$ = 3

Model  Parameter        After 100 observations           After first 100 of 500 observations

  1    $\theta_1$         0       0       0       0       0       0       0       0       0       0
  2    $\theta_2$         0       0       0       0       0       0       0       0       0       0
  3    $\theta_3$     .9554   .9564   .7630   .9098   .9432   .2733   .9534   .9471   .9449   .9554
  4    $\theta_4$     .0378   .0365   .1756   .0738   .0469   .5534   .0400   .0444   .0457   .0378
  5    $\theta_5$     .0052   .0055   .0451   .0130   .0075   .1168   .0050   .0070   .0075   .0052
  6    $\theta_6$     .0016   .0017   .0162   .0034   .0024   .0564   .0016   .0015   .0019   .0016

  1    $\beta_0$     0.1567  0.1026  0.0891  0.0839  0.1259       *       *       *       *       *

  2    $\beta_0$     0.1298  0.1197  0.1192  0.1118  0.1462       *       *       *       *       *
       $\beta_1$      .2696   .2258   .2151   .2422   .2527

  3    $\beta_0$    -0.0114  0.0098 -0.0332  0.0004  0.0156       *       *       *       *       *
       $\beta_1$      .2564   .2354   .2396   .2542   .2607
       $\beta_2$      .2882   .2282   .3290   .2329   .2463

  4    $\beta_0$    -0.0110  0.0098 -0.0317 -0.0008  0.0161       *       *       *       *       *
       $\beta_1$      .2421   .2427   .3033   .2117   .2864
       $\beta_2$      .2877   .2282   .3268   .2346   .2453
       $\beta_3$      .0189  -.0099  -.0846   .0562  -.0341

  5    $\beta_0$    -0.0056  0.0200 -0.0082 -0.0175  0.0314       *       *       *       *       *
       $\beta_1$      .2458   .2403   .2870   .2232   .2855
       $\beta_2$      .2562   .1692   .1788   .3427   .1641
       $\beta_3$      .0148  -.0080  -.0676   .0432   .0336
       $\beta_4$      .0271   .0512   .1288  -.0966   .0694

  6    $\beta_0$    -0.0061  0.0209 -0.0130 -0.0166  0.0300       *       *       *       *       *
       $\beta_1$      .2429   .1945   .3250   .2050   .2278
       $\beta_2$      .2583   .1767   .1921   .3418   .1767
       $\beta_3$      .0265   .1913  -.2368   .1214   .1958
       $\beta_4$      .0254   .0432   .1203  -.0968   .0584
       $\beta_5$     -.0089  -.1555   .1320  -.0609  -.1739

       $p_0$           0.13    0.15    0.18    0.17    0.17    0.18            0.11    0.17    0.13
       $p_1$            .02     .08     .03     .06     .10     .01     .04     .11     .12     .02
       $p_2$            .17     .24     .32     .27     .22     .32     .12     .12     .20     .17
       $p_3$            .08     .03       0       0     .02       0     .05     .02     .02     .08
       $p_4$            .06     .03     .02     .07     .03     .01     .07     .14     .07     .06
       $p_5$            .03     .03     .07     .03     .05       0     .11     .02     .04     .03
       $p_6$              0     .12     .14     .14     .02     .06     .02     .06     .07       0
       $p_7$            .29     .17     .11     .09     .21     .26     .16     .15     .15     .29
       $p_8$            .05     .02     .01     .05     .04       0     .16     .12     .04     .05
       $p_9$            .17     .13     .12     .12     .14     .16     .17     .15     .12     .17

* Not recorded.

The values of the posterior probabilities and parameter means after 10 simulations, of 100 obser-
vations each, of the sequential selection procedure. The last 5 columns are data from the first
100 observations of the 500 observation simulations tabulated in table 7. The posterior means
were not recorded for these 5 cases. Also listed are the proportions $p_i$ of the times each $a^{(i)}$ was
chosen as the optimal experiment.

TABLE 7. - L = 6, l* = 3


Model  Param           After 500 observations

  1    θ_1        0        0        0        0        0
  2    θ_2        0        0        0        0        0
  3    θ_3    .6046    .9812    .9722    .9746    .8526
  4    θ_4    .3388    .0175    .0257    .0230    .1316
  5    θ_5    .0425    .0009    .0018    .0021    .0125
  6    θ_6    .0141    .0003    .0003    .0003    .0032

  1    β_0   0.1321   0.1378   0.1325   0.1349   0.1413

  2    β_0   0.1356   0.1278   0.1247   0.1168   0.1383
       β_1    .2434    .2616    .2541    .2549    .2581

  3    β_0  -0.0026   0.0067  -0.0066  -0.0021  -0.0029
       β_1    .2446    .2556    .2486    .2466    .2571
       β_2    .2542    .2336    .2605    .2615    .2622

  4    β_0  -0.0027   0.0067  -0.0067  -0.0017  -0.0030
       β_1    .2076    .2515    .2340    .2341    .2872
       β_2    .2544    .2335    .2608    .2609    .2623
       β_3    .0491    .0055    .0194    .0167   -.0399

  5    β_0  -0.0113   0.0077   0.0111   0.0063  -0.0062
       β_1    .2081    .2518    .2332    .2394    .2873
       β_2    .2926    .2279    .2845    .2061    .2774
       β_3    .0486    .0052    .0203    .0101   -.0401
       β_4   -.0300    .0049   -.0200    .0496   -.0121

  6    β_0  -0.0118   0.0047  -0.0100   0.0061  -0.0073
       β_1    .2164    .3128    .2432    .2386    .2682
       β_2    .2949    .2455    .2801    .2070    .2819
       β_3    .0130   -.2307   -.0225    .0134    .0405
       β_4   -.0317   -.0098   -.0168    .0490   -.0155
       β_5    .0270    .1773    .0328   -.0026   -.0618

       p_0    0.178    0.136    0.148    0.116    0.168
       p_1     .004     .064     .028     .070
       p_2     .314     .206     .198     .086     .282
       p_3     .004     .036     .092     .112     .018
       p_4     .010     .014     .038     .110     .014
       p_5        0     .092     .014     .024     .016
       p_6     .022     .004     .012     .022     .002
       p_7     .296     .190     .280     .288     .304
       p_8        0     .104     .026     .014     .010
       p_9     .172     .154     .164     .158     .174

The values of the posterior probabilities and parameter
means after five simulations, of 500 observations each, of
the sequential selection procedure. Also listed are the
proportions of the times each a^(i) was chosen as the
optimal experiment.
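
The posterior quantities tabulated above can be illustrated with a minimal sketch. It uses the standard conjugate update for a linear model with known error variance and a multivariate normal prior; it is not taken from the report, and it may differ from the report's own computation. The polynomial regressors, the two-model setup, the noise-free data, and all function names are assumptions made only for illustration.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def update_model(x, y, mean, cov, sigma2):
    """Update one model's normal prior after observing y at regressor vector x.

    Returns (posterior_mean, posterior_cov, predictive_density_of_y).
    """
    vx = [dot(row, x) for row in cov]      # V x (V is symmetric)
    pred_var = dot(x, vx) + sigma2         # x'Vx + sigma^2
    resid = y - dot(x, mean)               # y minus its predictive mean
    density = math.exp(-0.5 * resid ** 2 / pred_var) / math.sqrt(2.0 * math.pi * pred_var)
    k = len(x)
    new_mean = [mean[i] + vx[i] * resid / pred_var for i in range(k)]
    new_cov = [[cov[i][j] - vx[i] * vx[j] / pred_var for j in range(k)] for i in range(k)]
    return new_mean, new_cov, density

def update_probabilities(probs, densities):
    """Bayes' rule over models: posterior proportional to prior times predictive density."""
    weights = [p * d for p, d in zip(probs, densities)]
    total = sum(weights)
    return [w / total for w in weights]

def run_demo(n_pairs=20, sigma2=0.25):
    """Hypothetical example: degree-0 and degree-1 polynomial models, data y = 0.5 t."""
    degrees = [0, 1]
    means = [[0.0] * (d + 1) for d in degrees]
    covs = [[[1.0 if i == j else 0.0 for j in range(d + 1)] for i in range(d + 1)]
            for d in degrees]
    probs = [0.5, 0.5]
    for _ in range(n_pairs):
        for t in (-1.0, 1.0):
            y = 0.5 * t
            densities = []
            for m, d in enumerate(degrees):
                x = [t ** p for p in range(d + 1)]   # regressors 1, t, ..., t^d
                means[m], covs[m], dens = update_model(x, y, means[m], covs[m], sigma2)
                densities.append(dens)
            probs = update_probabilities(probs, densities)
    return probs, means
```

Calling `run_demo()` returns the posterior model probabilities and the posterior parameter means, the same kinds of quantities as the θ and β rows of the table.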

TABLE 8. - L = 6, l* =


Model  Param         After 100 observations           After first 100 of 500 observations

  1    θ_1         0       0       0       0       0       0       0       0       0       0
  2    θ_2         0       0       0    .021    .042       0       0    .011    .003    .026
  3    θ_3      .945    .942    .956    .895    .848    .943    .877    .852    .904    .857
  4    θ_4      .043    .043    .038    .063    .042    .048    .106    .089    .070    .076
  5    θ_5      .007    .012    .005    .015    .033    .007    .013    .035    .018    .028
  6    θ_6      .004    .003    .001    .006    .035    .002    .004    .013    .006    .013

  1    β_0    0.1502 -0.0299  0.0231 -0.0079  0.0034       *       *       *       *       *

  2    β_0    0.0356  0.0189  0.0288  0.0316  0.0431       *       *       *       *       *
       β_1     .5159   .5101   .5123   .5079   .5106       *       *       *       *       *

  3    β_0   -0.0348 -0.0413 -0.0412 -0.0098  0.0026       *       *       *       *       *
       β_1     .5040   .5077   .5096   .5084   .5133       *       *       *       *       *
       β_2     .1467   .1265   .1478   .0837   .0850       *       *       *       *       *

  4    β_0   -0.0333 -0.0396 -0.0414 -0.0070  0.0026       *       *       *       *       *
       β_1     .4874   .5288   .5019   .5513   .5146       *       *       *       *       *
       β_2     .1450   .1252   .1478   .0810   .0849       *       *       *       *       *
       β_3     .0217  -.0261   .0099  -.0456  -.0016       *       *       *       *       *

  5    β_0   -0.0232 -0.0242 -0.0431 -0.0041  0.0161       *       *       *       *       *
       β_1     .4985   .5201   .5039   .5516   .4861       *       *       *       *       *
       β_2     .0642  -.0063   .1616   .0184  -.1260       *       *       *       *       *
       β_3     .0101  -.0185   .0074  -.0463   .0235       *       *       *       *       *
       β_4     .0774   .1223  -.0131   .0605   .2068       *       *       *       *       *

  6    β_0   -0.0298 -0.0234 -0.0435 -0.0067  0.0135       *       *       *       *       *
       β_1     .6005   .5398   .5233   .5143   .3504       *       *       *       *       *
       β_2     .1187  -.0120   .1655   .0517  -.0226       *       *       *       *       *
       β_3    -.3886  -.1048  -.0708   .1797   .6034       *       *       *       *       *
       β_4     .0292   .1270  -.0167   .0301   .1071       *       *       *       *       *
       β_5     .3037   .0673   .0599  -.1893  -.4482       *       *       *       *       *

       p_0      0.10    0.16    0.14    0.24            0.18    0.14    0.17            0.22
       p_1       .05     .06     .07     .02     .13     .25     .04     .02             .01
       p_2       .08     .20     .15     .03     .03     .06     .17     .04             .02
       p_3       .04     .05     .07     .07     .02     .01     .06     .01               0
       p_4       .08     .13     .08     .26     .18     .16     .06     .10     .10     .05
       p_5       .08     .08     .10     .12     .19     .07     .17     .25     .06     .41
       p_6       .08     .04     .06     .02     .03     .03       0       0     .14     .01
       p_7       .11     .10     .09     .03     .05     .09     .19     .19     .05     .02
       p_8       .22     .02     .10       0     .01     .06     .03     .01     .01       0
       p_9       .16     .16     .14     .21     .18     .09     .14     .21     .17     .26

*Not recorded.

The values of the posterior probabilities and parameter means after 10 simulations, of 100
observations each, of the sequential selection procedure. The last 5 columns are data from the
first 100 observations of the 500 observation simulations tabulated in table 9. Also listed
are the proportions of the times each a^(i) was chosen as the optimal experiment.

TABLE 9. - L = 6, l* = 5


Model  Param           After 500 observations

  1    θ_1        0        0        0        0        0
  2    θ_2        0        0        0        0        0
  3    θ_3     .974     .882     .899     .976     .976
  4    θ_4     .020     .024     .029     .021     .022
  5    θ_5     .003     .075     .062     .002     .002
  6    θ_6     .002     .020     .009        0        0

  1    β_0  -0.1051           -0.0926   0.0130   0.0390

  2    β_0   0.0290   0.0373   0.0436   0.0351   0.0458
       β_1    .4860    .5032    .4903    .5038    .5053

  3    β_0  -0.0181  -0.0137   0.0018  -0.0258   0.0065
       β_1    .5016    .5008    .5019    .5056    .5061
       β_2    .1075    .1046    .0803    .1189    .0791

  4    β_0  -0.0179  -0.0142   0.0027  -0.0257   0.0066
       β_1    .5035    .4859    .5201    .5177    .4956
       β_2    .1071    .1050    .0782    .1187    .0790
       β_3   -.0025    .0198   -.0237   -.0160    .0138

  5    β_0  -0.0100   0.0012   0.0159  -0.0202   0.0136
       β_1    .4959    .4940    .5016    .5169    .4962
       β_2    .0413   -.0152   -.0328    .0897    .0306
       β_3    .0061    .0081        0   -.0151    .0129
       β_4    .0653    .1185    .1116    .0241    .0455

  6    β_0  -0.0166  -0.0044   0.0157  -0.0186   0.0135
       β_1    .4232    .4352    .4617    .5039    .4949
       β_2    .0891    .0094   -.0250    .0835    .0309
       β_3    .2801    .2087    .1232    .0400    .0179
       β_4    .0224    .0996    .1030    .0288    .0452
       β_5   -.2070   -.1452   -.0854   -.0430   -.0039

       p_0    0.158    0.106    0.208    0.174    0.132
       p_1     .252     .190     .314     .010     .128
       p_2     .108     .054     .016     .306     .138
       p_3     .040     .052     .014     .006     .008
       p_4     .170     .184     .100     .020     .104
       p_5     .022     .034     .114     .030     .088
       p_6     .056        0        0     .072     .002
       p_7     .074     .134     .052     .208     .172
       p_8     .036     .118     .110     .022     .092
       p_9     .084     .128     .072     .152     .132

The values of the posterior probabilities and parameter
means after five simulations, of 500 observations each, of
the sequential selection procedure. Also listed are the
proportions of the times each a^(i) was chosen as the
optimal experiment.

TABLE 10. - SMALL SAMPLE STUDY ONE

[l* = 3, t* = (i), J_MAX = 8]

θ_min     τ      μ_{3,0}        PCS      ASN

0.70     0.5     (0, 0)        0.133     6.36
 .70      .5     (0.5, 0.5)     .458     7.15
 .70      .5     (1.0, 1.0)     .544     6.82
 .70      .5     (1.5, 1.5)     .446     5.89
 .80      .5     (0, 0)         .173     7.50
 .80      .5     (0.5, 0.5)     .468     7.78
 .80      .5     (1.0, 1.0)     .531     7.52
 .80      .5     (1.5, 1.5)     .460     7.24
 .90      .5     (0, 0)         .229     7.98
 .90      .5     (0.5, 0.5)     .479     7.92
 .90      .5     (1.0, 1.0)     .513     7.81
 .90      .5     (1.5, 1.5)     .439     7.76
 .70     1.0     (0, 0)         .397     5.49
 .70     1.0     (0.5, 0.5)     .673     5.88
 .70     1.0     (1.0, 1.0)     .737     5.29
 .70     1.0     (1.5, 1.5)     .621     4.84
 .80     1.0     (0, 0)         .558     6.90
 .80     1.0     (0.5, 0.5)     .755     6.94
 .80     1.0     (1.0, 1.0)     .771     6.50
 .80     1.0     (1.5, 1.5)     .700     6.22
 .90     1.0     (0, 0)         .605     7.80
 .90     1.0     (0.5, 0.5)     .765     7.45
 .90     1.0     (1.0, 1.0)     .777     7.15
 .90     1.0     (1.5, 1.5)     .689     7.12
 .70     2.0     (0, 0)         .699     4.24
 .70     2.0     (0.5, 0.5)     .871     4.03
 .70     2.0     (1.0, 1.0)     .877     3.62
 .70     2.0     (1.5, 1.5)     .723     3.48
 .80     2.0     (0, 0)         .868     5.45
 .80     2.0     (0.5, 0.5)     .962     4.99
 .80     2.0     (1.0, 1.0)     .970     4.63
 .80     2.0     (1.5, 1.5)     .872     4.61
 .90     2.0     (0, 0)         .944     6.46
 .90     2.0     (0.5, 0.5)     .967     5.66
 .90     2.0     (1.0, 1.0)     .969     5.48
 .90     2.0     (1.5, 1.5)     .939     5.80

*Not recorded.

Resulting PCS and ASN values for J_MAX = 8 and
the combinations of θ_min, τ, and μ_{3,0}. Results
are based upon 1500 simulations of the procedure
for each combination.
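
As a minimal sketch of how the two summary columns could be computed from simulation output, the snippet below reads PCS as the proportion of runs in which the selected model is the true one, and ASN as the average number of observations taken before termination. These are the standard meanings of the abbreviations and are assumed here. The function name and the record layout are hypothetical.

```python
def summarize_runs(runs, true_model):
    """Summarize simulated runs of a sequential selection procedure.

    runs: list of (selected_model, n_observations) pairs, one per run
          (hypothetical record layout).
    Returns (pcs, asn): proportion of correct selections and
    average sample number.
    """
    n_runs = len(runs)
    pcs = sum(1 for selected, _ in runs if selected == true_model) / n_runs
    asn = sum(n_obs for _, n_obs in runs) / n_runs
    return pcs, asn

# Four made-up runs with true model 3; at most 8 observations per run.
example_runs = [(3, 5), (3, 8), (2, 8), (3, 7)]
pcs, asn = summarize_runs(example_runs, true_model=3)
```

With these four made-up runs, three select model 3 and the sample sizes total 28, so `pcs` is 0.75 and `asn` is 7.0.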

TABLE 11. - SMALL SAMPLE STUDY ONE



[l* = 3, T = (i), J_MAX = 16]

θ_min     τ      μ_{3,0}        PCS      ASN

0.70     0.5     (0, 0)        0.354     9.48
 .70      .5     (0.5, 0.5)     .665    10.7
 .70      .5     (1.0, 1.0)     .723     9.63
 .70      .5     (1.5, 1.5)     .555     7.38
 .80      .5     (0, 0)         .508    13.6
 .80      .5     (0.5, 0.5)     .761    13.3
 .80      .5     (1.0, 1.0)     .806    12.3
 .80      .5     (1.5, 1.5)     .661    11.8
 .90      .5     (0, 0)         .574    15.5
 .90      .5     (0.5, 0.5)     .752    14.6
 .90      .5     (1.0, 1.0)     .800    13.9
 .90      .5     (1.5, 1.5)     .710    13.8
 .70     1.0     (0, 0)         .548     6.53
 .70     1.0     (0.5, 0.5)     .821     6.82
 .70     1.0     (1.0, 1.0)     .825     6.09
 .70     1.0     (1.5, 1.5)     .637     5.36
 .80     1.0     (0, 0)         .808     9.48
 .80     1.0     (0.5, 0.5)     .971     9.30
 .80     1.0     (1.0, 1.0)     .961     8.16
 .80     1.0     (1.5, 1.5)     .865     7.86
 .90     1.0     (0, 0)         .927    12.1
 .90     1.0     (0.5, 0.5)     .973    10.8
 .90     1.0     (1.0, 1.0)     .964    10.1
 .90     1.0     (1.5, 1.5)     .958    10.6
 .70     2.0     (0, 0)         .700     4.25
 .70     2.0     (0.5, 0.5)     .878     4.17
 .70     2.0     (1.0, 1.0)     .855     3.59
 .70     2.0     (1.5, 1.5)     .714     3.51
 .80     2.0     (0, 0)         .911     5.67
 .80     2.0     (0.5, 0.5)     .990     5.12
 .80     2.0     (1.0, 1.0)     .988     4.84
 .80     2.0     (1.5, 1.5)     .894     4.71
 .90     2.0     (0, 0)         .996     7.13
 .90     2.0     (0.5, 0.5)    1.00      6.25
 .90     2.0     (1.0, 1.0)    1.00      5.94
 .90     2.0     (1.5, 1.5)     .995     6.20

Resulting PCS and ASN values for J_MAX = 16 and
the combinations of θ_min, τ, and μ_{3,0}. Results
are based upon 1000 simulations.

TABLE 12. - SMALL SAMPLE STUDY TWO



[l* = 2, J_MAX = 8]

θ_min     τ      μ_{2,0}     PCS      ASN

0.70     0.5     (1.0)      0.760     7.86
 .80      .5     (1.0)       .734     7.98
 .90      .5     (1.0)       .740     7.98
 .70     1.0     (.5)        .828     7.63
 .70     1.0     (1.0)       .882     7.20
 .70     1.0     (1.5)       .800     6.86
 .80     1.0     (1.0)       .872     7.98
 .90     1.0     (.5)        .880     7.97
 .90     1.0     (1.0)       .898     7.98
 .90     1.0     (1.5)       .832     7.99
 .70     2.0     (1.0)       .900     5.13
 .80     2.0     (1.0)       .936     7.89
 .90     2.0     (1.0)       .934     7.98

The PCS and ASN values resulting from 500
simulations of the sequential procedure
for each of the tabulated combinations of
θ_min, τ, and μ_{2,0}.

Figure 1. - Tabulations of the prior means of the parameters
and graphs of the resulting functions over the interval 
[-1, +1] for large sample polynomial study one.



Figure 2. - Tabulations of the prior means of the param- 
eters and graphs of the resulting functions over the 
interval [-1, + 1] for large sample polynomial study 
two. 

Figure 5. - Illustration of prior means for
performance simulation experiment one. 

Figure 6. - Illustration of prior means for small
sample performance simulation experiment two. 



